Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix charclass parser to ignore whitespace and quote backslash #34

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mkurkov
Copy link
Contributor

@mkurkov mkurkov commented Jun 19, 2015

Hi, working on Javasript PEG grammar for neotoma, I found that it is not very convinient to use charclasses - it uses whitespace and doesn't quote backslashes, so I did this patch to fix it.
PEG ignores whitespace everywhere so I thought it is right thing to do in charclasses too and this makes JS grammar whith pretty big charclasses more readable. Like this:
Before:

UnicodeLetter   <- [\\p{Lu}\\p{Ll}\\p{Lt}\\p{Lm}\\p{Lo}\\p{Nl}];

After:

UnicodeLetter   <- [ \p{Lu} 
                     \p{Ll}
                     \p{Lt}
                     \p{Lm} 
                     \p{Lo} 
                     \p{Nl} ];

@bookshelfdave
Copy link
Contributor

Travis doesn't seem to like R14

@mkurkov
Copy link
Contributor Author

mkurkov commented Jun 23, 2015

Looks like this is due to outdated travis config. #35 should fix this.

@bookshelfdave
Copy link
Contributor

It would be nice to have the wiki reflect this change too.

@seancribbs
Copy link
Owner

@metadave @mkurkov I'm not sure I will accept this change yet. It seems innocuous but might break other users' grammars.

@mkurkov mkurkov force-pushed the fix/charclass-backslash-quote branch from ba759a9 to 6a98c09 Compare June 23, 2015 14:48
@mkurkov
Copy link
Contributor Author

mkurkov commented Jun 23, 2015

@seancribbs Well, I see, it is not backward compatible. I of course can rewrite grammar so charclasses will not contain whitespaces and newlines, but I think this change in line with PEG syntax. Maybe we can have it in version 2.0, I see you are working on it in separate branch?

@seancribbs
Copy link
Owner

@mkurkov Right, and sadly that's really far off, given the huge task I have taken on there. In an ideal world, the character class will be compiled out to a case expression rather than using re internally -- unless I find that re has better and more predictable performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants