Keeping last token on error recovery #41

Open
thehangedman opened this issue Jan 4, 2021 · 1 comment

Comments

@thehangedman

Hello,

I'm parsing a simple C-like language and I've got a problem with error recovery.

I provide an ErrorRule for member declarations:
member_declaration.ErrorRule = SyntaxError + ";" | SyntaxError + "}"

But it doesn't work well with incomplete declarations like
{
    int ; // missing name identifier
}
or
{
    int value // missing ;
}

When the parser encounters the unexpected ";" or "}", it starts error recovery by skipping that token and then looks for the next ";" or "}" instead. This either swallows the following declaration (if one ends with ";") or consumes the closing "}", so the next top-level declaration is parsed as a nested one.

It seems it would be better if the parser did not skip the current token (";" or "}" in the examples above) but kept it as the current token when starting recovery. Recovery would then immediately find the ";" or "}" and consume it, finishing successfully without disturbing the following declarations (at least in the first example).

I tried modifying ErrorRecoveryParserAction as follows to test the idea:
var currentToken = context.CurrentToken;
errorShiftAction.Execute(context);
context.CurrentToken = currentToken;
and it seems to work well for the tests above.
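
To make the difference concrete, here is a toy model of the two strategies. It is only a sketch and not Irony's actual code: `RecoveryDemo`, `Recover`, and `keepCurrent` are hypothetical names, tokens are plain strings, and the position where the error is detected is passed in by hand.

```csharp
using System;
using System.Collections.Generic;

static class RecoveryDemo
{
    // Synchronizing tokens, mirroring the ErrorRule above (";" or "}").
    static readonly HashSet<string> SyncTokens = new HashSet<string> { ";", "}" };

    // Returns the index of the first token the parser would see after recovery.
    // errorIndex:  position of the token at which the syntax error was detected.
    // keepCurrent: false = discard the offending token before scanning
    //                      (the behaviour described in this issue);
    //              true  = start scanning at the offending token (the proposal).
    public static int Recover(IReadOnlyList<string> tokens, int errorIndex, bool keepCurrent)
    {
        int i = keepCurrent ? errorIndex : errorIndex + 1;

        // Skip forward until a synchronizing token or the end of input.
        while (i < tokens.Count && !SyncTokens.Contains(tokens[i]))
            i++;

        // Consume the synchronizing token itself, if one was found.
        return Math.Min(i + 1, tokens.Count);
    }

    public static void Main()
    {
        // "{ int ; int b ; }": the error is detected at the first ";" (index 2).
        var tokens = new[] { "{", "int", ";", "int", "b", ";", "}" };

        Console.WriteLine(Recover(tokens, 2, keepCurrent: false)); // 6: "int b ;" is swallowed
        Console.WriteLine(Recover(tokens, 2, keepCurrent: true));  // 3: resumes at "int b ;"
    }
}
```

With the offending token kept, parsing resumes at the next declaration instead of at the closing "}".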

Or is there any better way to achieve the same result?

@rivantsov
Contributor

You are right, error recovery is not perfect, and there may be better ways to do it, like the one you suggest. You just need to verify that it works better not only for this particular case, but also for other cases in your language, for other languages and (!!!) for other 'typical' typing mistakes. I implemented error recovery based on the primitive recovery methods for LALR parsers suggested years and years ago, in all these dragon books.
The whole framework was actually built following the classic books on compilers, which compiled source files on disk. Error recovery was there to 'try' to get back to some consistent state and continue discovering more errors; the process had already failed, and the only goal now was to find more errors, if any, and report them all to the programmer in the printout when he GETS IT IN THE MORNING from the big machine (those were the days, programs queued for the CPU on mainframes :-) ).
Nowadays we have IDEs, and compilers are heavily used in interactive editors, checking syntax while the user types. The challenges are bigger: we try to recover and continue discovering things down the file, not just to find more errors, but to recover the 'good parts', so that IntelliSense keeps working and showing things like available methods. This is quite different from the error printouts of the 60s and 70s.
So in general, Irony's approach to recovery is primitive by today's standards; improving things like 'skipping better to a known symbol' would not fix the general problem, which would require quite a different approach. Some day in the future.
Thank you
Roman
