This program builds an inverted index from a directory of input HTML files. It outputs several sets of files: a directory file, files containing the individual tokens in each input file, files containing the TF-IDF score of each unique token in each input file, and a dictionary and postings file.
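As a rough illustration of how TF-IDF scores and a dictionary/postings pair relate, here is a minimal sketch. It is not the repository's code: the function name `build_index`, the in-memory data layout, and the weighting (relative term frequency times the natural log of inverse document frequency) are all assumptions, since the source does not say which TF-IDF variant or file layout it uses.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build a toy dictionary and postings structure.

    docs: dict mapping a document name to its list of tokens.
    Returns (dictionary, postings), where dictionary maps each term to
    its document frequency and postings maps each term to a list of
    (document name, TF-IDF score) pairs.
    NOTE: hypothetical layout and weighting, chosen for illustration only.
    """
    n_docs = len(docs)
    term_counts = {name: Counter(tokens) for name, tokens in docs.items()}

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    postings = defaultdict(list)
    for name in sorted(docs):
        counts = term_counts[name]
        total = sum(counts.values())
        for term, count in counts.items():
            tf = count / total                          # relative term frequency
            idf = math.log(n_docs / doc_freq[term])     # natural-log IDF
            postings[term].append((name, tf * idf))

    dictionary = {term: len(entries) for term, entries in postings.items()}
    return dictionary, dict(postings)


if __name__ == "__main__":
    sample = {"a.html": ["cat", "dog"], "b.html": ["cat", "cat"]}
    dictionary, postings = build_index(sample)
    print(dictionary)
    print(postings)
```

Under this weighting, a term that appears in every document gets a score of zero, which is why many implementations smooth the IDF term; whether the program does so is not stated.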
katherine-atwell/inverted-index
About
Parses a series of input HTML files, builds an inverted index, and outputs a series of files that track the tokens in each input file. Written in Python.
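The parsing step can be sketched with the standard library alone. This is an illustrative assumption, not the repository's implementation: the class `TextExtractor`, the function `tokenize_html`, and the tokenization rule (lowercase runs of ASCII letters and digits, with `script` and `style` contents skipped) are all hypothetical choices.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def tokenize_html(html):
    """Return lowercase alphanumeric tokens from an HTML string.

    NOTE: the tokenization rule here is an assumption for illustration.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    text = " ".join(parser.chunks)
    return re.findall(r"[a-z0-9]+", text.lower())


if __name__ == "__main__":
    page = "<html><body><p>Hello, World!</p><script>var x = 1;</script></body></html>"
    print(tokenize_html(page))
```

The token lists produced this way could feed a per-file token output and the TF-IDF computation.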