ir_text

A simple information retrieval package for Python.

Simple Demo

import ir_text

# data and queries should be dicts holding a list of records,
# e.g. {'dataset': [{'text': [...], 'id': int}, ...]}
an_inverted_index = ir_text.InvertedIndex(data['dataset'], language = 'english')
a_linear_index = ir_text.LinearIndex(data['dataset'], language = 'english')

a_linear_index.construct()
an_inverted_index.construct(idf = True)

results_linear = a_linear_index.search(queries['queries'][0])
results_inverted = an_inverted_index.search(queries['queries'][0], ir_text.Measures.TF)

A similar but more complete example is available in ir_text_notebook.ipynb.
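
For readers new to the two index types used in the demo, the following is a minimal, self-contained sketch of the general idea in plain Python. It does not use ir_text and is not the package's implementation. The toy records, the helper names (linear_search, build_inverted_index, inverted_search), and the assumption that 'text' holds a list of token strings are all illustrative.

```python
from collections import defaultdict

# Hypothetical toy data in the shape the demo comment describes
# (assumption: 'text' is a list of already-tokenized strings).
data = {
    "dataset": [
        {"id": 0, "text": ["cats", "chase", "mice"]},
        {"id": 1, "text": ["dogs", "chase", "cats"]},
        {"id": 2, "text": ["mice", "eat", "cheese"]},
    ]
}


def linear_search(records, query_terms):
    """Scan every record and return the ids that contain any query term."""
    wanted = set(query_terms)
    return [rec["id"] for rec in records if wanted & set(rec["text"])]


def build_inverted_index(records):
    """Map each term to the sorted list of record ids that contain it."""
    postings = defaultdict(set)
    for rec in records:
        for term in rec["text"]:
            postings[term].add(rec["id"])
    return {term: sorted(ids) for term, ids in postings.items()}


def inverted_search(index, query_terms):
    """Look up only the posting lists of the query terms."""
    hits = set()
    for term in query_terms:
        hits.update(index.get(term, []))
    return sorted(hits)


index = build_inverted_index(data["dataset"])
print(linear_search(data["dataset"], ["cats"]))
print(inverted_search(index, ["cats"]))
```

Both searches return the same ids here. The difference is cost: the linear scan touches every record per query, while the inverted index pays once at construction time and then reads only the posting lists for the query terms.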

Package Composition

Core

Bag of Words

Measures
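
The demo passes ir_text.Measures.TF to search and idf = True to construct. As a rough illustration of what term frequency and inverse document frequency usually mean, here is a small standalone sketch. The function names and the exact formulas (length-normalized TF, natural-log IDF) are assumptions chosen for illustration and may differ from what ir_text computes.

```python
import math
from collections import Counter


def term_frequency(tokens):
    """Relative frequency of each term within one document."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequency(documents):
    """log(N / df) per term, where df is the number of documents containing it."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# Hypothetical tokenized documents.
docs = [
    ["cats", "chase", "mice"],
    ["dogs", "chase", "cats"],
    ["mice", "eat", "cheese"],
]

idf = inverse_document_frequency(docs)
tf = term_frequency(docs[0])
tfidf = {term: tf[term] * idf[term] for term in tf}
print(tfidf)
```

Terms that appear in fewer documents get a larger IDF, so weighting TF by IDF favors terms that are distinctive for a document over terms that are common across the collection.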

To Do:

  • Specify stoplist path
  • Finish notebook
  • Linear index evaluation
  • Inverted index evaluation