Add EOF symbol to match end of input #237
Comments
Any update? |
I have a version that should be working for LALR, in a branch (https://github.com/lark-parser/lark/tree/end_symbol). If anyone wants to test it and let me know how it goes, that would be great! |
I rebased this onto the master branch. There was a conflict in https://github.com/jisaacstone/lark/blob/end_symbol/tests/test_parser.py#L1650-L1660 but all the tests seem to pass. To be sure, I added another test, and it failed: https://github.com/jisaacstone/lark/blob/end_symbol/tests/test_parser.py#L1650-L1660
I need to dive deep to get a better understanding of what is going on here exactly, but meanwhile does anything look off about the code I linked or the test I wrote? |
The test seems fine. I think the exception happens because some of the new preprocessing code isn't aware that [...]. I'll look into it soon, unless you manage to work it out sooner. |
I've done a bit more digging. Two things that may or may not be relevant:
1st: the above test fails with the same error on the original branch (before the rebase).
2nd: altering the test slightly produces a different error:
    @unittest.skipIf(PARSER!='lalr', "Using the end symbol currently works for LALR only")
    def test_end_symbol2(self):
        grammar = """
            start: (a|b)+
            a: "a" (E|ND)
            b: "b"
            E: $
            ND: "x"
        """
        parser = _Lark(grammar)
        self.assertEqual(parser.parse('axa'), Tree('start', [Tree('a', []),Tree('a', [])]))
        self.assertRaises(UnexpectedInput, parser.parse, 'ab')
In the rebased case this fails in a different way, because an assert has been added (https://github.com/lark-parser/lark/blob/master/lark/load_grammar.py#L637); with the assert removed, it fails in the same way. Hope these findings are helpful. I'm about to go on vacation, so I probably won't get back to looking at this for a couple of weeks at least. |
Hi @jisaacstone, I just created a new branch, and I don't get the same exceptions as you. In fact, everything works perfectly for me. I also added the two tests you outline here (and renamed the 2nd version). Let me know if you think I missed something. Otherwise, all you have left to do is get it to work for Earley. Enjoy your vacation! |
OK, looks like my original thought was correct: I did not resolve the conflicts correctly in this section of code: https://github.com/jisaacstone/lark/blob/end_symbol/tests/test_parser.py#L1650-L1660 I ought to have gone back to that when I hit trouble. Sorry for the misdirection. |
+1 for $ to represent end of string. I just ran into a need for this. |
Just pushed this. Here's an example from the tests that shows what it does:
    def test_end_symbol2(self):
        grammar = """
            start: (a|b)+
            a: "a" ("x"|$)
            b: "b"
        """
        parser = _Lark(grammar)
        self.assertEqual(parser.parse('axa'), Tree('start', [Tree('a', []),Tree('a', [])]))
        self.assertRaises(UnexpectedInput, parser.parse, 'ab')
|
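For anyone who wants to see the intended behaviour of `$` in isolation, here is a minimal sketch in plain Python. It is not Lark's implementation and uses no Lark API; the `END` sentinel and the `parse` helper are hypothetical names made up for illustration. It hand-codes the grammar from the test above (`start: (a|b)+`, `a: "a" ("x"|$)`, `b: "b"`) and treats end of input as an explicit symbol that a rule can check for.

```python
END = object()  # hypothetical sentinel standing in for the `$` end-of-input symbol


def parse(text):
    """Hand-written matcher for:
        start: (a|b)+
        a: "a" ("x"|$)
        b: "b"
    Returns the list of rule names matched, or raises ValueError.
    """
    tokens = list(text) + [END]  # append an explicit end marker to the input
    pos = 0
    matched = []
    while tokens[pos] is not END:
        if tokens[pos] == "a":
            nxt = tokens[pos + 1]  # safe: the last token is always END
            if nxt == "x":
                pos += 2  # matched "a" "x"
            elif nxt is END:
                pos += 1  # matched "a" $ (the end marker is checked, not consumed)
            else:
                raise ValueError(f"unexpected {nxt!r} at position {pos + 1}")
            matched.append("a")
        elif tokens[pos] == "b":
            pos += 1  # matched "b"
            matched.append("b")
        else:
            raise ValueError(f"unexpected {tokens[pos]!r} at position {pos}")
    if not matched:
        raise ValueError("empty input: expected at least one a or b")
    return matched
```

As in the test, `parse('axa')` succeeds with two `a` matches (the second `a` is accepted only because it sits at the end of the input), while `parse('ab')` is rejected because the `a` is followed by neither `"x"` nor the end of input.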
I've just checked
I don't claim to be even remotely an expert on parsing, but not being able to match For instance (simplified):
whereas |
Looks like you reverted in 3bee210 but you don't say why. |
If I remember correctly, there were some problems with unexpected side effects.
Maybe, but your example is wrong:
works. |
So it does... again, not an expert, but that's very non-intuitive. Thanks for the help! |
@kneufeld I reverted it because it was buggy. I agree it would be a nice feature, but I don't think it's a necessity. |
Is there any update? I kind of need this. Or is there a workaround? |
I'll see what I can do |
@ThatXliner See PR #880 |
Can be signified with $
Must be optional (grammars don't have to contain it to be correct)
Need to make sure it works for both LALR and Earley, and with indentation.