ZeroDivisionError: division by zero #274

ArpiJakab · 2017-06-13T22:33:34Z

wordcloud = WordCloud().generate('not funny, funny,')

Generates error:
    wordcloud = WordCloud().generate('not funny, funny,')
  File "/Library/Python/2.7/site-packages/wordcloud/wordcloud.py", line 556, in generate
    return self.generate_from_text(text)
  File "/Library/Python/2.7/site-packages/wordcloud/wordcloud.py", line 541, in generate_from_text
    words = self.process_text(text)
  File "/Library/Python/2.7/site-packages/wordcloud/wordcloud.py", line 522, in process_text
    word_counts = unigrams_and_bigrams(words, self.normalize_plurals)
  File "/Library/Python/2.7/site-packages/wordcloud/tokenization.py", line 57, in unigrams_and_bigrams
    if score(count, counts[word1], counts[word2], n_words) > 30:
  File "/Library/Python/2.7/site-packages/wordcloud/tokenization.py", line 22, in score
    p2 = (c2 - c12) / (N - c1)
ZeroDivisionError: division by zero

amueller · 2017-06-15T20:49:49Z

Thanks, I can reproduce. Not sure what a good fix is. Did you get this in a real use case?
I "fixed" it so that "funny funny" is no longer detected as a collocation; the output is now a word cloud containing "funny", not "funny funny".
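
For context, the zero denominator in the traceback can be reproduced by hand. Assuming the stop word "not" has already been dropped, the input reduces to the two tokens "funny funny", so N = 2 and c1 = 2, and N - c1 is 0. The sketch below is not the library's code: `p2_term` and its guard are hypothetical and only illustrate the arithmetic from the failing line, plus one possible way to guard it.

```python
def p2_term(c12, c1, c2, n_words):
    """Return (c2 - c12) / (N - c1), or None when the denominator is zero.

    Hypothetical helper mirroring the failing line in the traceback.
    """
    denominator = n_words - c1
    if denominator == 0:
        # The first word accounts for every token, so the ratio is undefined.
        return None
    return (c2 - c12) / denominator


tokens = ["funny", "funny"]       # assumed result after stop-word removal
n_words = len(tokens)             # N = 2
c1 = c2 = tokens.count("funny")   # both words of the bigram are "funny"
c12 = 1                           # the bigram "funny funny" occurs once

print(p2_term(c12, c1, c2, n_words))
```

A caller could treat `None` as "not a collocation", which would match the behaviour described above, although the actual fix may be implemented differently.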

ArpiJakab · 2017-06-16T00:29:35Z

Hi Andreas, thank you for responding quickly. My data is a comma-separated set of unigram and bigram sentiments such as "funny", "not funny", "bad", "never good", etc. The word cloud only shows the second word of each bigram. For example, "not funny" is the most common sentiment, but the cloud only shows "funny". I tried changing every "not funny" to "not-funny", but that made no difference. I reduced the data set to a single line, and that's when I hit the divide-by-zero error. - Arpi

amueller · 2017-06-16T14:44:17Z

"not" is removed because it is a stop word, and English stop words are removed: https://github.com/amueller/word_cloud/blob/master/wordcloud/wordcloud.py#L195
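
As a rough illustration of that effect, here is a standard-library sketch of stop-word filtering. It is not the library's tokenizer: `EXAMPLE_STOPWORDS` is a one-word stand-in for the real stop-word list, and the regular expression is a simplifying assumption.

```python
import re

# Assumption: a tiny stand-in for the library's English stop-word list.
EXAMPLE_STOPWORDS = {"not"}


def tokenize_without_stopwords(text, stopwords=EXAMPLE_STOPWORDS):
    """Lowercase the text, split it into words, and drop stop words."""
    words = re.findall(r"\w+", text.lower())
    return [word for word in words if word not in stopwords]


print(tokenize_without_stopwords("not funny, funny,"))  # ['funny', 'funny']
```

With "not" gone, the only bigram left in the reported input is "funny funny".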

If you have already done the tokenization, I think it's better to call generate_from_frequencies, which bypasses the tokenization and normalization in WordCloud. You need to count the occurrences first, but that should be a one-line for loop.
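
A minimal sketch of that approach, assuming the input is a comma-separated string of phrases as described earlier in the thread. `count_phrases` and `build_cloud` are hypothetical helper names; `generate_from_frequencies` is the method named above, and `build_cloud` needs the third-party wordcloud package installed.

```python
from collections import Counter


def count_phrases(raw):
    """Count comma-separated phrases, ignoring empty entries."""
    phrases = (phrase.strip() for phrase in raw.split(","))
    return Counter(phrase for phrase in phrases if phrase)


def build_cloud(frequencies):
    """Build a word cloud from precomputed counts (requires wordcloud)."""
    from wordcloud import WordCloud  # imported lazily: third-party package
    return WordCloud().generate_from_frequencies(frequencies)


frequencies = count_phrases("not funny, funny, not funny, bad, never good,")
print(frequencies.most_common())
```

Because the phrases are counted as whole keys, "not funny" is passed through as a single entry and is not split or filtered by the stop-word step.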

ArpiJakab · 2017-06-16T21:06:30Z

Great, I'll give it a go, thanks! - Arpi

Prashiksha · 2022-09-09T15:48:19Z

I would like to work on this issue. A heads-up or approval would be helpful. I am a Master's student looking to contribute to this open source project.

amueller · 2022-10-18T19:00:57Z

@Prashiksha go for it!

Prashiksha · 2022-12-04T15:21:43Z

I have found a solution to this issue and would like to resolve it so that it can be closed. Should I change the API, or how would you like me to incorporate the change?
I'd appreciate some help, @amueller.

amueller added the "bug" and "Need Contributor" labels on Apr 17, 2018
