Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quick fix for calculation of widths of red bars #41

Merged
merged 3 commits into from
Jul 19, 2015
Merged

Conversation

kshirley
Copy link
Collaborator

@kshirley kshirley commented Jul 6, 2015

This branch implements a quick fix for calculating the widths of the red bars. Here, we calculate the term frequencies internally within createJSON() rather than using the user-supplied term.frequency.

The details are described in Issue #32

The good news is that the red bar widths are correct now. The bad news is that the blue bar widths, representing the overall frequencies of words, are not necessarily correct. They won't match the actual term frequencies provided by the user. Most of the differences are small, but this is a lingering issue. As mentioned in Issue #32 the solution may be to require the user to specify the priors that he/she used in fitting the model, so that the red and blue bars can properly account for the influence of the priors as well as the raw data itself. This would require a few additional calculations, and so far it only works when I've fit the model using the collapsed Gibbs sampler -- it failed to correctly visualize a model fit using gensim, which implements variational Bayes to fit the LDA model. I'm not sure if this is the source of the error, or just a coincidence.

Anyway, in this branch I've also updated the vignette called "details" with a similar explanation as written here.

I think this is an improvement on the previous version, so we should merge it, and maybe a bit down the line we can solve the problem for good.

@@ -16,6 +16,7 @@


\begin{document}
\SweaveOpts{concordance=TRUE}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kshirley at the top of this document, we tell the R installer to use knitr when compiling the vignette, which isn't directly compatible with Sweave.

If you want, I could switch from Rnw to Rmd, so we won't have to fiddle with this anymore...

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, devtools::build_vignettes() will mimic the "build vignette" stage that occurs during installation (without installing the package).

@cpsievert
Copy link
Owner

Go ahead and merge if you like @kshirley 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants