Inconsistent results when changing R/number of Top Items displayed in bar chart and .topic_info #185

Open
chuachinhon opened this issue Dec 8, 2020 · 0 comments

Hi there, I've been trying to change the number of top keywords displayed in the bar chart from the default 30 to 50 (or any other value), and I'm getting some unusual results. I wonder whether this is a known bug or whether I'm doing something wrong:

When I lower the number of Top Most Relevant Terms shown from the default 30 to, say, 20 or 10, this works fine in the bar chart:
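import pyLDAvis.sklearn

# best_lda_model, corpus_vectorized and vectorizer are my fitted LDA model, document-term matrix and vectorizer
panel = pyLDAvis.sklearn.prepare(
    best_lda_model, corpus_vectorized, vectorizer, mds="tsne", R=20, sort_topics=False
)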

After changing R, the bar chart and its title do show the top 20 items and their counts.

The panel.topic_info DataFrame, however, shows the top 20 keywords for the entire corpus but 30+ keywords for each topic.

When I raise R (the number of Top Most Relevant Terms shown) from the default 30 to, say, 40 or 50, the bar chart and its title still show only the default top 30 items instead of the top 40 or top 50.

When I inspect the panel.topic_info DataFrame for R=50, I do get the top 50 keywords for the entire corpus, but I get 80+ top keywords for Topic1 and 100+ for another topic.
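
The per-category keyword counts mentioned above can be tallied with a small helper along these lines. This is only a minimal sketch: `terms_per_category` is a hypothetical name, the list is made-up stand-in data rather than real output, and it assumes panel.topic_info has a `Category` column that labels each row as `Default` (the whole corpus) or `Topic1`, `Topic2`, and so on.

```python
from collections import Counter


def terms_per_category(categories):
    """Count how many rows (terms) carry each category label."""
    return dict(Counter(categories))


# Made-up stand-in for panel.topic_info["Category"] (assumed column name).
# With a real panel, pass that column instead of this list.
example_categories = ["Default"] * 3 + ["Topic1"] * 5 + ["Topic2"] * 4

counts = terms_per_category(example_categories)
for name, n in counts.items():
    print(f"{name}: {n} terms")
```

With a real panel, the call would be `terms_per_category(panel.topic_info["Category"])`; comparing each count against R makes the mismatch between the corpus-wide rows and the per-topic rows easy to see.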

Are these inconsistencies the result of my setting the R value incorrectly, or of differing versions of pandas or other libraries?

I'm using pyLDAvis 2.1.2, the latest version (through the sklearn interface), on Python 3.6 and 3.7.

Appreciate any help or advice on this. Thank you.
