Forum archive 2000-2006

D. Winslow - parsing of entries

by Arnold Pizer -
Topic: parsing of entries; topic started 3/15/2005; 9:05:01 PM
last post 3/16/2005; 8:07:30 PM
D. Winslow - parsing of entries
3/15/2005; 9:05:01 PM (reads: 978, responses: 2)
We are presently using MapleTA for large lecture sections of a business calculus course. There is a Maple-maple mode option for certain problems in which the student entry is passed for grading as a text entry with no parsing. We have to use this for specific types of problems, such as interval notation and set notation, in which certain characters are not allowed as entries in other modes. We then have routines that check these unparsed entries for correctness. There is now a need to write a parser that will parse formulas and numerical entries in this mode. In trying to model what MapleTA does in other modes, I found that the system is inconsistent in how exponentiation is handled. In strict text entry mode, x^y^z is parsed as (x^y)^z. There is also a symbol mode, in which users are told that keystrokes as well as a symbol palette can be used, and in which the same entry is parsed as x^(y^z). I think there should be consistency between modes. I thought that the former parsing was chosen in text mode to model how calculators interpret such entries, but it turns out that the parsing is not completely consistent with calculator interpretation either.

I notice that WebWork interprets x^y^z as x^(y^z), and that probably should be the convention if there is to be one. My question concerns omitted parentheses. If parentheses are omitted before a function argument, WebWork v1.9 inserts an opening parenthesis and closes it before any operation (lnx+2 -> ln(x)+2, sinx^2 -> sin(x)^2). However, sin-x^2 -> sin(-x^2), and I believe this probably should be the convention considering order of operations. Then shouldn't sin--x^2 be parsed the same as sinx^2 (it is not in WebWork v1.9), and if not, is there a strong argument against doing so? It would be very easy to have a parser do this, and I saw some discussion of this being the general rule in more current versions of WebWork. How is this handled in the current WebWork release, and can the parser be configured to accommodate either parsing convention?

David



Davide P. Cervone - Re: parsing of entries
3/16/2005; 8:14:04 AM (reads: 1269, responses: 0)
David:

I don't know anything about MapleTA, but if you are writing your mode-specific code in Perl, then you might consider using the new parser code that I wrote for WeBWorK. It is actually a stand-alone library (actually two libraries) that you could incorporate into other Perl programs, as all the WeBWorK-specific material has been isolated in one file that you would not load. That would save you a lot of work (unless you have already written your own). To do this, you would need the Parser and Value directory trees from the webwork pg/lib directory, plus Value.pl and Parser.pl from the pg/macros directory. You would need to modify these latter slightly to remove a few WeBWorK items (and put back a few commented-out items), but that part is not hard. The parser already implements interval notation and unions of intervals, vectors and points, matrices, complex numbers, and infinities, but it doesn't do set notation, though that could probably be added.

One of the design goals for the new parser was to allow it to be extendable, so you can add (or remove) operators, functions, types of parentheses, and so on, and can create specialized sub-classes of the predefined object classes. You can also adjust the operator precedences and associativity to suit your needs.

For example, you mention x^y^z. The usual rules of precedence say that raising to a power is right-associative, meaning it should be interpreted as x^(y^z), as WeBWorK currently does. If you wanted it to be (x^y)^z, however, you could do that by changing an entry in the precedence table, and that could be done on a problem-by-problem or course-by-course basis.
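
To make the associativity point concrete, here is a minimal sketch in Python of a precedence-climbing parser driven by a table. It is not the WeBWorK parser and does not use its API; the table layout, the names `DEFAULT_OPS` and `parse`, and the precedence numbers are all assumptions made for illustration. Unary minus and function calls are left out.

```python
import re

# Hypothetical precedence table: operator -> (precedence, associativity).
# The numbers are illustrative and not taken from any real parser.
DEFAULT_OPS = {
    "+": (1, "left"),
    "-": (1, "left"),
    "*": (2, "left"),
    "/": (2, "left"),
    "^": (3, "right"),
}

def parse(text, ops=DEFAULT_OPS):
    """Return a fully parenthesized string showing how text was grouped."""
    tokens = re.findall(r"[A-Za-z]+|\d+(?:\.\d+)?|[-+*/^()]", text)
    pos = 0

    def primary():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = expression(0)
            pos += 1  # skip the closing ")"
            return inner
        return tok

    def expression(min_prec):
        nonlocal pos
        left = primary()
        while pos < len(tokens) and tokens[pos] in ops:
            op = tokens[pos]
            prec, assoc = ops[op]
            if prec < min_prec:
                break
            pos += 1
            # A left-associative operator may not take an operator of the
            # same precedence on its right; a right-associative one may.
            right = expression(prec + 1 if assoc == "left" else prec)
            left = f"({left}{op}{right})"
        return left

    return expression(0)

print(parse("x^y^z"))                                      # (x^(y^z))
print(parse("x^y^z", {**DEFAULT_OPS, "^": (3, "left")}))   # ((x^y)^z)
```

Only the single table entry for `^` differs between the two calls, which is the kind of per-problem switch described above.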

The issue of how to handle function calls that do not include explicit parentheses is more subtle, and there is no really good solution to it. As you note, we have had some discussions of this in the past, and there is some disagreement among the WeBWorK developers about what the "right" approach is. While the rules of precedence of operators are pretty clear, there is no recognized standard for handling function calls with missing parens.

Your example of sin-x^2 is an important one, and illustrates some of the subtlety involved. Here, sine is trying to bind to the first operand to the right of it, but since the rules of precedence say that -x^2 is interpreted as -(x^2) not (-x)^2, this means the "first operand to the right" is all of -(x^2). So you get sin(-(x^2)). The same thing is true of your second example for sine: --x^2 is -(-(x^2)), not (--x)^2, so I think it is reasonable to make sin--x^2 return sin(-(-(x^2))). [In my opinion, the problem is really with sin x^2 returning sin(x)^2 rather than sin(x^2), but I don't want to reopen that debate at this point.]

The solution that the new parser uses is to consider "function apply" (with no explicit parens) as an operator similar to implicit multiplication. This operator gets a precedence like any other, and the setting of that precedence affects how it interacts with other operators. For example, setting the precedence of function apply to be equal to that of implied multiplication would make sin x^2 be interpreted as sin(x^2) while ln x+2 would still be ln(x)+2. (In this case sin 2x would be sin(2)*x, just as it is now; setting the precedence to between that of multiplication and exponentiation would make sin 2x become sin(2x).) Setting the precedence to be higher than exponentiation would make sin x^2 become (sin(x))^2.
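
The idea of treating function apply as an operator with its own precedence can be sketched as follows. This is an illustrative Python toy, not the WeBWorK code: the names `BINARY`, `FUNCTIONS`, and `fn_prec` and the precedence numbers are assumptions, and both implicit multiplication and unary minus are omitted, so the sin 2x and sin-x^2 cases are not modelled.

```python
import re

# Hypothetical precedences; fn_prec should not equal any of these values.
BINARY = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 4}
RIGHT_ASSOC = {"^"}
FUNCTIONS = {"sin", "cos", "ln"}

def parse(text, fn_prec):
    """Return a nested-tuple tree; fn_prec is the precedence of function apply."""
    tokens = re.findall(r"[A-Za-z]+|\d+|[-+*/^()]", text)
    pos = 0

    def operand():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = expression(0)
            pos += 1  # skip the closing ")"
            return inner
        if tok in FUNCTIONS:
            if pos < len(tokens) and tokens[pos] == "(":
                # Explicit parentheses delimit the argument.
                return (tok, operand())
            # Function apply: absorb only operators binding tighter than fn_prec.
            return (tok, expression(fn_prec))
        return tok

    def expression(min_prec):
        nonlocal pos
        left = operand()
        while pos < len(tokens) and tokens[pos] in BINARY:
            op = tokens[pos]
            prec = BINARY[op]
            if prec < min_prec:
                break
            pos += 1
            right = expression(prec if op in RIGHT_ASSOC else prec + 1)
            left = (op, left, right)
        return left

    return expression(0)

print(parse("sin x^2", fn_prec=3))  # ('sin', ('^', 'x', '2'))
print(parse("sin x^2", fn_prec=5))  # ('^', ('sin', 'x'), '2')
print(parse("ln x+2", fn_prec=3))   # ('+', ('ln', 'x'), '2')
```

With function apply below exponentiation the exponent is pulled into the argument; with it above, the function binds to x alone. No special case is needed for either reading, only a different number.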

The default settings of the precedences in the new parser mimic those of WeBWorK 1.9, but there is an experimental set of precedences that tries to make a more "natural" set of interpretations. It is not entirely successful, but has promise. The parser is also set up to allow issues of spacing to be considered when deciding on precedence, but this is somewhat controversial, and I won't get into it here. I only mention it so that if you want to use the new parser in your own code, you can use those features if you wish. The nice thing about it is that it is all based on precedence rules, so there are no new ideas or special cases involved; it's just that there are more operators than usual (like the function apply operation, or the fact that implied multiplication via a space is a separate operator from explicit multiplication with *, and so they can have different precedences).

Anyway, good luck with the project.

Davide



D. Winslow - Re: parsing of entries
3/16/2005; 8:07:30 PM (reads: 1236, responses: 0)
Davide,

Thanks for the explanation concerning sin--x^2. I actually have a parser ready that can be easily modified, but am having difficulty deciding what the default settings should be. My first inclination was to set the parser to mimic what most TI and Casio calculators do with regard to function evaluation and exponentiation, thinking that this is how MapleTA modeled their parser and that students would be on familiar ground. However, the MapleTA parser does not exactly mimic calculators. I can set up the parser to mimic them, but this would add a third interpretation of the same text input and add to the inconsistency and confusion. The calculators I referred to above interpret x^y^z as (x^y)^z, but interpret x^-y^z as x^(-y^z). For the same reason you give in your response, I believe the latter is the correct parsing. What is difficult to understand is the interpretation of x^-y^z^-w^t. This is calculated as (x^(-y^z))^(-w^t). Having the parser do this type of parsing required adding another subroutine, and it was somewhat harder. Since this is closest to what MapleTA does now in text mode, I am considering this as the default, although I disagree with the parsing of multiple exponentiation.
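
For illustration only, here is one rule, sketched in Python, that happens to reproduce the three calculator groupings quoted above. It is a guess fitted to those examples, not the calculators' documented algorithm and not the Maple routine: a unary minus in an exponent takes a left-to-right chain of plain powers as its operand and stops before the next "^-", while the outer chain of ^ is grouped left to right. The name `parse_calc` is hypothetical.

```python
import re

def parse_calc(text):
    """Group a chain of ^ (with optional unary minus) under the assumed rule."""
    tokens = re.findall(r"[A-Za-z]+|\d+|[-^]", text)
    pos = 0

    def plain_chain():
        # operand (^ operand)*, grouped left to right, stopping before "^-".
        nonlocal pos
        left = tokens[pos]
        pos += 1
        while (pos + 1 < len(tokens) and tokens[pos] == "^"
               and tokens[pos + 1] != "-"):
            right = tokens[pos + 1]
            pos += 2
            left = f"({left}^{right})"
        return left

    def term():
        nonlocal pos
        if tokens[pos] == "-":
            pos += 1
            return f"(-{plain_chain()})"
        tok = tokens[pos]
        pos += 1
        return tok

    # Outer chain of ^ is grouped left to right, as for x^y^z -> (x^y)^z.
    left = term()
    while pos < len(tokens) and tokens[pos] == "^":
        pos += 1
        left = f"({left}^{term()})"
    return left

print(parse_calc("x^y^z"))        # ((x^y)^z)
print(parse_calc("x^-y^z"))       # (x^(-(y^z)))
print(parse_calc("x^-y^z^-w^t"))  # ((x^(-(y^z)))^(-(w^t)))
```

The separate `plain_chain` helper is the extra piece that a plain precedence table does not give you, which may be why this behaviour needed its own subroutine.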

My first choice would be to parse exactly as WebWork does now. If text were parsed in other modes in the same manner, this makes the most sense to me. As far as function argument parentheses insertion, I insert no opening parenthesis if one is detected following a known function name. If not, an opening parenthesis is inserted and is closed at the first opportunity prior to a major operation. If a "-" character is detected immediately after a known function name, the rules change somewhat but follow standard conventions. I can change this so that closing occurs after exponentiation, after multiplication, etc., but I hesitate to do so (first, do no harm).
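
A minimal Python sketch of this kind of parenthesis insertion is shown below. It is not the Maple routine; the function list, the single-letter operand rule, and the treatment of a leading "-" (keep any following exponents inside the argument) are assumptions chosen so that the examples from earlier in the thread come out as quoted. Spaces in the input are not handled.

```python
import re

KNOWN_FUNCTIONS = ("sin", "cos", "tan", "ln", "exp")  # hypothetical global list

# Longest names first so a longer name is never shadowed by a shorter one.
_NAME = "|".join(sorted(KNOWN_FUNCTIONS, key=len, reverse=True))
# Assumed operand: a number or a single-letter variable.
_OPERAND = r"(?:\d+(?:\.\d+)?|[A-Za-z])"
_PATTERN = re.compile(
    rf"({_NAME})(?!\()"                    # function name not followed by "("
    rf"(-+{_OPERAND}(?:\^{_OPERAND})*"     # leading minus: keep exponents inside
    rf"|{_OPERAND})"                       # otherwise: close after one operand
)

def insert_parens(text):
    """Wrap the argument of each unparenthesized function call in parentheses."""
    return _PATTERN.sub(r"\1(\2)", text)

print(insert_parens("lnx+2"))    # ln(x)+2
print(insert_parens("sinx^2"))   # sin(x)^2
print(insert_parens("sin-x^2"))  # sin(-x^2)
```

Changing where the closing parenthesis goes amounts to changing what the second group is allowed to absorb, for example letting the no-minus branch also take `^` and its exponent.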

The parser could have been written in Perl, but is written as a Maple routine since we can add the routines to the Maple library and access the routine by name for specific problems. The routine has a global functions list and global constants list. It also has the option to add a new function name or a long variable name on a per-problem basis if a problem needs this. I spent some time on this parser, but setting the default options has become the real dilemma for me.
