WordCloud filters words regardless of passed regex #266
Comments
Hey. There is a (stalled) PR here: #215. I think it would be cool if we could allow this without adding yet another option. So the question is whether it's feasible to reproduce the current behavior by modifying the regexp and removing the hard-coded isnumber. Any input welcome!
My apologies, I missed
That excludes
Yes, my mistake, I should run tests. I'll work on a better regex.
cool thanks :)
I've been struggling for a while with this and can't find a regex that works in the same way as the current one whilst not matching strings that are entirely numbers.
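To illustrate why this is hard, here is a minimal sketch comparing the existing filtering with a regex-only candidate. It assumes the default pattern is `\w[\w']+` and that the trailing `'s` is stripped before digit-only tokens are dropped; both are assumptions about the current code, and the names `current`, `candidate` and `NO_NUMBERS` are made up for illustration. The negative lookahead handles plain numbers, but the two approaches diverge on a token like `2017's`.

```python
import re

DEFAULT = r"\w[\w']+"
# Candidate: same pattern, but refuse to start a match on a digits-only token.
NO_NUMBERS = r"(?!\d+(?![\w']))\w[\w']+"


def strip_possessive(words):
    # Drop a trailing 's from each token.
    return [w[:-2] if w.lower().endswith("'s") else w for w in words]


def current(text):
    # Assumed order of operations: strip 's first, then drop digit-only tokens.
    words = strip_possessive(re.findall(DEFAULT, text))
    return [w for w in words if not w.isdigit()]


def candidate(text):
    # The regex does the number filtering; no isdigit() check afterwards.
    return strip_possessive(re.findall(NO_NUMBERS, text))


# The two approaches agree on plain numbers.
print(current("area 51 in 2017"), candidate("area 51 in 2017"))
# They disagree here: the candidate keeps "2017" once the 's is stripped.
print(current("2017's results"), candidate("2017's results"))
```

A pattern could be extended to cover this case too, but each such edge case makes the default regexp harder to read, which may be an argument for a separate option.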
hm... I guess we do need to add another boolean option...
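A rough sketch of what such an option might look like, written as a standalone function rather than against the actual class. The flag name `include_numbers` and the function `extract_words` are assumptions for illustration only, and the default pattern is assumed to be `\w[\w']+`.

```python
import re


def extract_words(text, regexp=r"\w[\w']+", include_numbers=False):
    """Sketch only: `include_numbers` is a hypothetical flag name."""
    words = re.findall(regexp, text)
    if not include_numbers:
        # Behaviour described in this thread: drop tokens made only of digits.
        words = [w for w in words if not w.isdigit()]
    return words


text = "route 66 opened in 1926"
print(extract_words(text))                        # numbers dropped
print(extract_words(text, include_numbers=True))  # numbers kept
```

Defaulting the flag to False would keep today's output unchanged for existing users, while a custom regexp that targets numbers could opt in explicitly.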
Maybe it would be worth negating these two lines if a custom regex is used:
That makes the most sense to me anyway, but it would slightly change the functionality.
but a custom regex doesn't necessarily mean that people don't want plurals removed.
I guess, although this whole issue caught my attention because the class was not respecting my custom regex and I had to find the normalise plurals flag. (Also, it does not in fact remove plurals but possessives, which was also a little confusing.)
It removes plurals and possessives, so it might be a misnomer. We can rename it to
I understand what you mean.
If the optional regexp argument is passed to a WordCloud class, stopwords, '\s', and numbers are still removed from the word cloud's words. This behaviour (not so much the removal of stopwords and possessives but more numbers) seems unexpected to me, but this is up for contention.
Happy to create pull request.
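A minimal sketch of the behaviour described above, using only the standard library. It is a simplified stand-in, not the library's code: the function `tokenize`, the tiny stopword set and the default pattern `\w[\w']+` are assumptions for illustration. The point is that the filters run after the regexp, so a custom pattern that explicitly asks for digits still yields no numbers.

```python
import re

# Assumed stand-ins for the library's defaults; illustrative only.
DEFAULT_REGEXP = r"\w[\w']+"
STOPWORDS = {"the", "a", "of", "in"}


def tokenize(text, regexp=None, stopwords=STOPWORDS):
    """Hypothetical reduction of the filtering described in this issue."""
    pattern = regexp if regexp is not None else DEFAULT_REGEXP
    words = re.findall(pattern, text)
    # These filters run whether or not a custom regexp was supplied.
    words = [w for w in words if w.lower() not in stopwords]
    words = [w[:-2] if w.lower().endswith("'s") else w for w in words]
    words = [w for w in words if not w.isdigit()]
    return words


# The custom regexp matches digit runs, yet no numbers survive the filters.
print(tokenize("In 2017 the model's 42 layers", regexp=r"\d+|[A-Za-z']+"))
# prints ['model', 'layers']
```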