Sample Kotlin Grammar does not load properly and is unusable. #92

Open
Incoherent-Code opened this issue Jul 23, 2024 · 5 comments · Fixed by antlr/grammars-v4#4178


@Incoherent-Code

How to reproduce:

  1. Go to the ANTLR Lab site.
  2. Click on the sample dropdown and select either entry for Kotlin. (I'm assuming one entry is for kotlin-formal, but neither works.)
  3. Click on the tab labeled Parser.

You'll see that the Parser tab incorrectly contains the Kotlin lexer grammar instead of the Kotlin parser grammar. The sample does not work in this state.

Even with the Kotlin parser in the correct place, a solution is needed for importing UnicodeClasses.g4, which the Kotlin lexer relies on.
Otherwise, the sample throws many implicit token errors. I usually have to manually copy the contents of UnicodeClasses.g4 to the end of the lexer to use the Kotlin grammar with ANTLR Lab.
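
For anyone who wants to script that manual workaround, here is a rough sketch. The helper name `inline_import` is made up for illustration, and the grammar snippets are toy stand-ins rather than the real Kotlin files. It assumes the import statement names only one grammar and that the first line matching a `grammar X;` header is the real header (not text inside a comment).

```python
import re


def inline_import(main_text, imported_text, imported_name):
    """Hypothetical helper: paste an imported grammar's rules into its importer."""
    # Drop the `import <name>;` statement from the importing grammar.
    # Assumption: the statement imports only this one grammar.
    import_stmt = re.compile(
        r"^[ \t]*import\s+" + re.escape(imported_name) + r"\s*;[ \t]*\n?",
        re.MULTILINE,
    )
    main_text = import_stmt.sub("", main_text, count=1)

    # Drop the imported grammar's own header, e.g. `lexer grammar UnicodeClasses;`.
    header = re.compile(
        r"^[ \t]*(?:lexer\s+|parser\s+)?grammar\s+\w+\s*;[ \t]*\n?",
        re.MULTILINE,
    )
    body = header.sub("", imported_text, count=1)

    # Append the remaining rules to the end of the importing grammar.
    return main_text.rstrip("\n") + "\n\n" + body.lstrip("\n")


# Toy stand-ins, not the actual contents of the Kotlin grammar files.
lexer_src = "lexer grammar KotlinLexer;\nimport UnicodeClasses;\nID: UNICODE_CLASS_LL+;\n"
unicode_src = "lexer grammar UnicodeClasses;\nfragment UNICODE_CLASS_LL: 'a'..'z';\n"

merged = inline_import(lexer_src, unicode_src, "UnicodeClasses")
print(merged)
```

Simple concatenation like this ignores ANTLR's override rules for imports, so it is only reasonable when the two files do not define rules with the same name.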

@kaby76
Collaborator

kaby76 commented Jul 23, 2024

Someone manually changed the grammars.json file. https://github.com/antlr/grammars-v4/blob/1e08bcbcc56b8ff2cfad7508815544e141d188e9/grammars.json#L2070. It's wrong and it should have been generated by script, not hand edited. https://github.com/antlr/grammars-v4/blob/master/_scripts/mkindex.py

@Incoherent-Code
Author

Upon further inspection, this is actually a bug with mkindex.py itself.
I tried running mkindex.py again and got this output, which is still wrong:
grammar.json

@Incoherent-Code
Author

The problem lies with lines 113 and 114:

lexer = grammars[0] if 'Lexer' in grammars[0] else grammars[1]
parser = grammars[0] if 'Parser' in grammars[0] else grammars[1]

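The Kotlin pom file lists UnicodeClasses.g4 first, followed by the lexer and the parser. Since grammars[0] is then UnicodeClasses.g4, whose name contains neither 'Lexer' nor 'Parser', both expressions fall through to grammars[1]. As a result, lexer and parser are both set to KotlinLexer.g4.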

<includes>
   <include>UnicodeClasses.g4</include>
   <include>KotlinLexer.g4</include>
   <include>KotlinParser.g4</include>
</includes>
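
To make the edge case concrete, here is a minimal sketch that runs the two quoted lines against the include order above, then tries one possible more defensive selection. `pick_by_suffix` is a hypothetical helper, not something that exists in mkindex.py, and matching on the file-name suffix is my assumption about the naming convention.

```python
# Include order taken from the Kotlin pom.xml quoted above.
grammars = ["UnicodeClasses.g4", "KotlinLexer.g4", "KotlinParser.g4"]

# The two lines quoted from mkindex.py: grammars[0] matches neither test,
# so both expressions fall through to grammars[1].
lexer = grammars[0] if 'Lexer' in grammars[0] else grammars[1]
parser = grammars[0] if 'Parser' in grammars[0] else grammars[1]
print(lexer, parser)  # KotlinLexer.g4 KotlinLexer.g4


def pick_by_suffix(names, suffix):
    """Hypothetical helper: return the first name ending in `suffix`, or None."""
    return next((n for n in names if n.endswith(suffix)), None)


# Search the whole list instead of assuming the pair sits at indices 0 and 1.
fixed_lexer = pick_by_suffix(grammars, "Lexer.g4")
fixed_parser = pick_by_suffix(grammars, "Parser.g4")
print(fixed_lexer, fixed_parser)  # KotlinLexer.g4 KotlinParser.g4
```

This still depends on the `*Lexer.g4` / `*Parser.g4` naming, so it would return None for a combined grammar or an unconventionally named file.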

I also noticed that kotlin-formal/pom.xml doesn't include UnicodeClasses.g4 at all, even though KotlinLexer.g4 still imports from it.

@kaby76
Collaborator

kaby76 commented Jul 27, 2024

> Upon further inspection, this is actually a bug with mkindex.py itself. I tried running mkindex.py again and got this output, which is still wrong: grammar.json

Thanks for checking this. The error is in the pom.xml itself--it has UnicodeClasses.g4 stated as a "top-level g4". Yes, it is a "lexer grammar", but it is not a "top-level g4". A "top-level g4" is a g4 that we run the tool on. UnicodeClasses.g4 is an imported file, so the tool should not be run on this file.
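
As an illustration of the "top-level g4" distinction, here is a minimal sketch that treats a grammar as top-level when no other grammar in the same set imports it. This is not how trgen or mkindex.py work internally; the function names are hypothetical, the regex ignores comments, string literals, and the `import A = B;` alias form, and the grammar texts are toy stand-ins.

```python
import re

# Matches ANTLR 4 import statements such as `import UnicodeClasses;`
# or `import A, B;`. Comments, strings, and aliases are not handled.
IMPORT_RE = re.compile(r"^\s*import\s+([^;]+);", re.MULTILINE)


def imported_names(grammar_text):
    """Return the set of grammar names imported by one .g4 source."""
    names = set()
    for group in IMPORT_RE.findall(grammar_text):
        names.update(part.strip() for part in group.split(","))
    return names


def top_level(sources):
    """Given {grammar name: .g4 text}, drop grammars that another one imports."""
    imported = set()
    for text in sources.values():
        imported |= imported_names(text)
    return [name for name in sources if name not in imported]


# Toy stand-ins for the three Kotlin files, not their real contents.
sources = {
    "UnicodeClasses": "lexer grammar UnicodeClasses;\n",
    "KotlinLexer": "lexer grammar KotlinLexer;\nimport UnicodeClasses;\n",
    "KotlinParser": "parser grammar KotlinParser;\noptions { tokenVocab = KotlinLexer; }\n",
}
print(top_level(sources))  # ['KotlinLexer', 'KotlinParser']
```

With this rule, UnicodeClasses is excluded because KotlinLexer imports it, which matches the point that the tool should not be run on it directly.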

(And honestly, I don't understand why we are using the pom.xml for this information, when this can all be derived by trparse/trquery, or by looking at the desc.xml. The Maven tester has been replaced by trgen because trgen figures out top-level grammars, start rules, etc. When it can't, it uses the desc.xml.)

I'll need to fix the pom.xml and reindex.

@kaby76
Collaborator

kaby76 commented Aug 9, 2024

While the parser and lexer grammar tabs now fill up with the correct .g4 data, lab.antlr.org does not work with either of the Kotlin grammars. It can't, because the .g4 files contain "import" statements and lab.antlr.org does not implement a UI for imported grammars. The mkindex.py script does not weed out these grammars, but it should. antlr/grammars-v4#4201
