Skip to content

stikos/nlp

Repository files navigation

nlp

A python project for the course of Language Technology/NLP. Using the data gathered by some custom made crawlers made with scrapy for various news portals, and the nltk module, we create a vector space representation of our collection and an inverted index. Ultimately, the 2 basic functionalities are:

  • relevant article search, based on the tf-idf metric regarding the search query keywords
  • categorization of a text by looking at the top features (frequency-wise) of it's content

About

University project

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages