erickpeirson/PLSA

 
 

This is a PLSA (Probabilistic Latent Semantic Analysis) implementation for large corpora using the EM (Expectation-Maximization) algorithm and multiprocessing.

When modeling large corpora, memory consumption can become a severe bottleneck. This project addresses that problem by using PyTables, which stores large arrays in HDF5 files on disk rather than holding them entirely in memory.
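
To illustrate the general idea of keeping a model matrix on disk, here is a minimal sketch using PyTables and NumPy. It is not taken from this project: the helper names `create_topic_word_store` and `read_topic_row`, the node name `p_w_z`, and the file layout are all assumptions for the example. The sketch writes a topic-by-word matrix one row at a time, so only a single row needs to be in memory at once.

```python
import numpy as np
import tables


def create_topic_word_store(path, n_topics, n_words, seed=0):
    """Create an on-disk P(w|z) matrix (hypothetical layout, for illustration)."""
    rng = np.random.default_rng(seed)
    with tables.open_file(path, mode="w") as h5:
        # A chunked array stored in the HDF5 file rather than in RAM.
        p_w_z = h5.create_carray(
            h5.root, "p_w_z",
            atom=tables.Float64Atom(),
            shape=(n_topics, n_words),
        )
        for z in range(n_topics):
            row = rng.random(n_words)
            # Write one normalized row at a time.
            p_w_z[z, :] = row / row.sum()


def read_topic_row(path, z):
    """Load a single topic's word distribution from disk as a NumPy array."""
    with tables.open_file(path, mode="r") as h5:
        return h5.root.p_w_z[z, :]
```

With this layout, a call such as `read_topic_row("model.h5", 0)` reads only the requested row, so the matrix can be larger than available memory.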

Requirements

License

This software is available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License (CC BY-NC-SA 3.0), allowing for any non-commercial reuse with appropriate attribution and similar licensing.

Questions?

erick [dot] peirson [at] asu [dot] edu

Acknowledgements

This project is run by Erick Peirson and the Digital Innovation Group (DigInG) in the Center for Biology at Arizona State University. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. 2011131209.

This project is based on a PLSA implementation by Liangjie Hong.
