Improve speed of KWIC results return time #60

stephbuon · 2022-02-11T00:02:32Z

Go to Language/Word Context, select "similarity" from the measure drop-down, and search for a word in the corpus. The app returns a scatter plot of the words most associated with the search word (according to word2vec and cosine similarity; see line 102). If you click one of the scatter plot points and wait ~9 seconds, a data frame pops up showing that word's keyword in context (KWIC).

A ~9-second wait for results is a problem. Can we optimize KWIC so it returns results in a reasonable amount of time?

Here's the KWIC code:
https://github.com/stephbuon/hansard-shiny/tree/main/app/modules/kwic

It's called by: https://github.com/stephbuon/hansard-shiny/blob/main/app/modules/word-context/word_context.R

Caching the results (kwick_cache.R) lets us return results in real time; however, I don't know whether we would generate too much cache.
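One way to keep a result cache from growing without limit is to cap the number of stored entries and evict the least recently used one. The sketch below illustrates that idea in Python rather than in the app's R code. `BoundedCache`, `get_or_compute`, and `max_entries` are hypothetical names, and this is not how kwick_cache.R works today.

```python
from collections import OrderedDict


class BoundedCache:
    """Least-recently-used cache holding at most `max_entries` results.

    Hypothetical helper for illustration; not part of hansard-shiny.
    """

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._store = OrderedDict()

    def get_or_compute(self, key, compute):
        """Return the cached value for `key`, computing it on a miss."""
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            return self._store[key]
        value = compute(key)
        self._store[key] = value
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used
        return value

    def __len__(self):
        return len(self._store)
```

With a cap like this, frequently clicked words stay fast while rarely requested ones are recomputed, so memory use is bounded by `max_entries` times the size of one KWIC result. The right cap would have to be chosen by measuring typical result sizes.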

You'll see that I am borrowing a function from quanteda (this one: https://quanteda.io/reference/kwic.html).
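For anyone new to KWIC, here is a minimal sketch of what a keyword-in-context lookup computes, with one possible speed-up: build a word-to-positions index once so each click does not rescan the corpus. It is written in Python for illustration and is not quanteda's implementation. `build_index` and `kwic` are hypothetical names, and whether rescanning or re-tokenizing actually accounts for the ~9 seconds has not been measured.

```python
from collections import defaultdict


def build_index(tokens):
    """Map each lowercased token to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos, tok in enumerate(tokens):
        index[tok.lower()].append(pos)
    return index


def kwic(tokens, index, keyword, window=5):
    """Return (left context, keyword, right context) for every match."""
    rows = []
    for pos in index.get(keyword.lower(), []):
        left = " ".join(tokens[max(0, pos - window):pos])
        right = " ".join(tokens[pos + 1:pos + 1 + window])
        rows.append((left, tokens[pos], right))
    return rows


# Build the index once at startup, then reuse it for every lookup.
tokens = "the house voted and the house adjourned".split()
index = build_index(tokens)
for row in kwic(tokens, index, "house", window=2):
    print(row)
```

The index costs memory proportional to the corpus size, but each lookup then only touches the positions where the keyword occurs.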

stephbuon · 2022-03-07T20:00:02Z

@EliasLMann here is another first problem you can work on if you do not want to work on Log Likelihood.

stephbuon added the "optimization" and "good first problem" labels Feb 11, 2022

stephbuon assigned EliasLMann Feb 11, 2022
