-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
%nonassoc not handled correctly #39
Comments
So I am closing this issue meanwhile; if you think this is not right, please feel free to reopen and comment. |
I don't agree with @mrgleba. The definition of non-associative operators is exactly that it is illegal to combine two or more of these without explicit parentheses. See https://en.wikipedia.org/wiki/Operator_associativity#Non-associative_operators So there is nothing bogus about removing both the shift and the reduce action at the conflict. Nor does it require "wild changes in the construction of the automaton". You simply remove both actions after generating the automaton, just as you would remove the shift action in conflicts with left-associative operators and the reduce action in conflicts with right-associative operators. Besides, as it is implemented now, use of %nonassoc will only resolve precedence conflicts and not associativity conflicts, so it is useful only for prefix or suffix operators and not for infix operators. And for prefix or suffix operators, using %left or %right works equally well. So %nonassoc is (in the current implementation) useless. |
What I meant is: %nonassoc OP expr: is bogus: an operator declared explicitely as non-assoc is uassed as an associative operator. Accepting this and reporting this only at runtime would require an extra states to track if OP has already been used in expr. In cases when you want this you do: %nonassoc OP expr: expr1: I cannot see how this transformation could be addd easily in the current implementation. Generally %nonassoc is used for explicit precedence and as a sanity-check to give a warning when you use it as an associative operator. IMHO no more should be expected of it. |
The example was deliberately reduced to a minimum. I agree that for this particular case, it would be better to rewrite the grammar. But if you have a large number of infix operators, some of which are (left or right) associative and some of which are non-associative, it becomes cumbersome. As I said, the definition of non-associative is clear enough: It does not prohibit using the operator as an infix operator, it just forbids self-combination without explicit parentheses. Your argument for why you should rewrite the grammar instead of using precedence declarations applies equally well to all other cases of precedence and associativity conflicts: The "safest" thing to do is to explicitly rewrite the grammar. But precedence declarations are much more convenient (and gives smaller and faster parsers). As long as it is clear how conflicts are handled by precedence declarations, I can not see any problem. |
One thing that could be considered is adding two new declarations: %prefix and %suffix. These declare the operators to be prefix or suffix operators, so they essentially only provide precedence information. If there are any conflicts involving two operators with the same precedence, a warning should be given. |
I understand it might be nice to be able to use %nonasoc this way but it would be a hard task requiring a lot of work. You'll have to provide a real-world compelling example to warrant even considering this as a feature. |
Very well. Troll (http://topps.diku.dk/~torbenm/troll.msp) is a language for randomly rolling or calculating probabilities for dice rolls. It has a multitude of infix operators, including ".." that given two numbers give a collection of all numbers in the interval. For example, "2 .. 5" gives the collection "2 3 4 5". It is non-associative because, for example, "2 .. 5 .. 7" does not make sense. I use, in mosmlyac, the %nonassoc operator to handle this. This makes it possible to use only one nonterminal for expressions. Otherwise, I would need three: One for operators that bind stronger than "..", one for ".." and one for operators and control structures that bind weaker than "..". You can download the code (in Moscow ML) for Troll including the parser specification at http://www.diku.dk/~torbenm/Troll/ Note that mosmlyac generates the parser without any conflicts. When porting Troll to F#, the same grammar specification generates 195 reduce/reduce conflicts (related to issue #40) and one shift-reduce conflict for the ".." operator. Also, I can't see why it should require a lot of work to make %nonassoc work as I suggest: You simply remove both actions when there is a shift/reduce conflict involving a non-associative operator. This is only marginally more work than removing one of these when there is a shift/reduce conflict involving a left- or right-associative operator. If you are worried about users not expecting this behaviour, you can issue a warning that these actions are removed. |
Could you provide .output file from momlyacc? |
Sure. |
I've done a bit of further digging and it seems you were right. I overestimated the amount of changes this would require. Mostly perhaps due to my earlier "failed" experiments with this. Now I think I found the reason for it. Strangely enough there is some code in fsyacc that tries to undo parsing errors by forcing them to be reductions (or accepts). It seems either wrong or it's a hack around something or I'm missing a point. I'll investigate that a bit more. I'll follow on this issue in #51 |
Thanks. The odd behaviour at parsing errors might be a partial attempt to add error recovery (which will allow multiple syntax errors to be reported in one parse). A common strategy for this is to reduce and then skip input to the next symbol that is in the follow set of the reduced-to nonterminal. |
Makes sense I guess. Firstly we cannot throw an exception on a wrongly used nonassoc if we try to recuperate. I'd say we should avoid doing that. In the #51 I've |
Even with error recovery, an error should be reported, but afterwards the parsing can continue. So there is no case of hiding the error. AFAIK, fsyacc does not allow parsing to continue after an error, so something needs to be added to support error recovery. In any case, error recovery should be optional. You may want to revert to an editor/IDE after the first error, and many people do not like the "false errors" that follow a recovery (since recovery is seldom perfect). As for %nonassoc (as well as %left and %right), a removed action should (at run time) work like it was never there, i.e., give an error. Recovery is a separate issue. |
%nonassoc is not handled correctly: When an operator (say, NEQ) is declared nonassociative, there should be no shift/reduce conflict on a state containing items
Instead, BOTH the shift and the reduce actions should be eliminated if the next symbol is NEQ. This will cause (correctly) a parse error on, say,
Exp NEQ Exp NEQ Exp. As it is, it parses equivalently to Exp NEQ (Exp NEQ Exp).
A very simple grammar file that shows the problem is
%token ONE NEQ
%nonassoc NEQ
%start Exp
%type Exp
%%
Exp : ONE {1}
| Exp NEQ Exp {if $1 = $3 then 0 else 1}
%%
Peter Sestoft tells me that he corrected this error (and one having to do with reduce/reduce conflicts) in mosmlyac (which is also based on ocamlyacc) about 10 years ago.
The text was updated successfully, but these errors were encountered: