Extension mechanism for the parser #123

Open
Allam76 opened this issue Feb 17, 2022 · 3 comments

Comments

@Allam76

Allam76 commented Feb 17, 2022

First of all thank you for an excellent solution!

Feature Description

Have you thought of an extension mechanism for the parser? As far as I know, yacc does not allow extending the grammar. I have a custom storage engine for go-mysql-server that requires some small changes to the tokenizer and parser. Is my only option to fork your version of vitess?

Use Case(s)

A custom storage engine that does not follow ANSI or MySQL SQL syntax.

@zachmu
Member

zachmu commented Feb 18, 2022

You are correct that extensions with yacc grammars are more or less impossible. Your best bet would be to fork the project to make the customizations you want.

What we could do on the go-mysql-server side is make the parser pluggable the way other parts of the engine are. It's kind of a lot of work to transform the vitess AST into the go-mysql-server query plan tree though, and it would be a pretty fragile extension point.
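
As a rough illustration of what a pluggable parser could look like, here is a minimal sketch. Every name in it (`Parser`, `Statement`, `Engine`, `NewEngine`, `scanParser`) is hypothetical and invented for this example; none of it is go-mysql-server's or vitess's actual API, and the toy parsers only stand in for a real grammar.

```go
package main

import (
	"fmt"
	"strings"
)

// Statement is a hypothetical stand-in for the plan node an engine would
// build from a query. It is not a go-mysql-server type.
type Statement struct {
	Kind  string
	Table string
}

// Parser is the hypothetical extension point: anything that can turn
// query text into a Statement can be plugged into the engine.
type Parser interface {
	Parse(query string) (Statement, error)
}

// defaultParser stands in for the built-in yacc-based parser.
// This toy version only understands "SELECT * FROM <table>".
type defaultParser struct{}

func (defaultParser) Parse(query string) (Statement, error) {
	f := strings.Fields(query)
	if len(f) == 4 && strings.EqualFold(f[0], "SELECT") && strings.EqualFold(f[2], "FROM") {
		return Statement{Kind: "select", Table: f[3]}, nil
	}
	return Statement{}, fmt.Errorf("unsupported query: %q", query)
}

// scanParser is a made-up dialect extension. It accepts "SCAN <table>"
// and delegates everything else to the wrapped parser.
type scanParser struct {
	next Parser
}

func (s scanParser) Parse(query string) (Statement, error) {
	f := strings.Fields(query)
	if len(f) == 2 && strings.EqualFold(f[0], "SCAN") {
		return Statement{Kind: "select", Table: f[1]}, nil
	}
	return s.next.Parse(query)
}

// Engine holds whichever parser it was configured with.
type Engine struct {
	parser Parser
}

// NewEngine falls back to the default parser when none is supplied.
func NewEngine(p Parser) *Engine {
	if p == nil {
		p = defaultParser{}
	}
	return &Engine{parser: p}
}

// Plan parses the query with the configured parser.
func (e *Engine) Plan(query string) (Statement, error) {
	return e.parser.Parse(query)
}
```

With this shape, `NewEngine(scanParser{next: defaultParser{}})` would accept both the custom `SCAN orders` form and ordinary `SELECT * FROM t`, while `NewEngine(nil)` keeps the stock behavior. The fragile part mentioned above is hidden inside `Parse`: each custom parser has to produce the engine's own plan type itself.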

Do you have some examples of what you're talking about that we could examine?

@Allam76
Author

Allam76 commented Apr 8, 2022

First of all, sorry for my late reply.

One example is dealing with Postgres schemas. When connecting to Postgres from an external query engine, one needs a path syntax like <db>.<schema>.<table> as the table identifier. By parsing this as a path instead of as a fixed <id>.<id> pair, one can future-proof the identifiers. This is the approach taken by Calcite and Presto.
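
To show what I mean by a path, here is a minimal sketch. `TablePath` and `ParseTablePath` are hypothetical names made up for this example, not anything from vitess or go-mysql-server, and the sketch assumes unquoted identifiers with no dots inside a name part.

```go
package main

import (
	"fmt"
	"strings"
)

// TablePath is a hypothetical identifier type: an ordered list of name
// parts, e.g. ["mydb", "public", "users"], rather than a fixed pair.
type TablePath []string

// ParseTablePath splits a dotted identifier into its parts.
// Assumption: identifiers are unquoted and contain no embedded dots.
func ParseTablePath(s string) (TablePath, error) {
	parts := strings.Split(s, ".")
	for _, p := range parts {
		if p == "" {
			return nil, fmt.Errorf("empty name part in identifier %q", s)
		}
	}
	return TablePath(parts), nil
}

// Table returns the last part, which names the table itself.
// It expects a path produced by ParseTablePath (never empty).
func (p TablePath) Table() string {
	return p[len(p)-1]
}

// Qualifiers returns everything before the table name
// (database, schema, catalog, and so on).
func (p TablePath) Qualifiers() []string {
	return p[:len(p)-1]
}
```

The same type covers `users`, `mydb.users` and `mydb.public.users`, so a storage engine that needs an extra level does not require a grammar change.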

This comes up a lot when adapting existing DB SQL dialects, and there seem to be three ways to handle it:

  1. Fork the parser for each use case, which leads to many parallel implementations.
  2. Switch from yacc to some other grammar solution that supports extension. Calcite has that for example.
  3. Accept PRs with all sorts of changes to the parser that are then filtered out and discarded during analysis of the AST if not needed.
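
Option 3 could be sketched roughly as follows. This is only one possible reading of "filtered out during analysis": the parser accepts every dialect's syntax, each AST node records which dialect introduced it, and an analysis pass rejects nodes whose dialect is not enabled. `Node` and `CheckDialects` are hypothetical names, not part of vitess or go-mysql-server.

```go
package main

import "fmt"

// Node is a hypothetical AST node. Dialect is empty for core syntax and
// otherwise names the extension that introduced the node.
type Node struct {
	Kind     string
	Dialect  string
	Children []*Node
}

// CheckDialects walks the tree depth-first and returns an error for the
// first node whose dialect is not in the enabled set.
func CheckDialects(n *Node, enabled map[string]bool) error {
	if n == nil {
		return nil
	}
	if n.Dialect != "" && !enabled[n.Dialect] {
		return fmt.Errorf("%s syntax (%s) is not enabled", n.Dialect, n.Kind)
	}
	for _, c := range n.Children {
		if err := CheckDialects(c, enabled); err != nil {
			return err
		}
	}
	return nil
}
```

The cost of this approach is visible even in the sketch: the shared grammar and node types grow with every dialect, and every consumer has to run the check.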

Option 2 is usually considered the best solution, but it would require you to switch parser generators. Not a very fun prospect.

ANTLR can be extended and has a Go target. I'm not sure how PEG for Go works.

@Allam76
Author

Allam76 commented Apr 8, 2022

I did a quick analysis. It would not be too hard for me to translate the parser to ANTLR and keep compatibility with the query plan tree. That would allow extension, separation of concerns and a more "modern" parser generator. However, it would also break compatibility with vitess.

In Calcite, someone used this extensibility to run queries directly from a GraphQL parser. 😃
