Experimental prototypes based on the dataset produced by the Nineteenth-Century Knowledge Project led by Peter M. Logan.
To reproduce the POC from this repository and the corpus, follow the steps below. The `cd` paths are relative to the folder that contains poc.
- create a new folder poc
- clone this repository into poc/eb-pre
- clone the Encyclopedia repository in a separate folder poc/kp-editions
Link the corpus into the data folder of this repository:
cd poc/eb-pre/data
ln -s ../../kp-editions
And remove superseded copies of the encyclopedia entries:
rm -rf kp-editions/eb07/TXT/ kp-editions/eb07/XML/
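The layout described above can be checked before going further. The following is a minimal sketch, not part of this repository: the function name `check_layout` is hypothetical, and it only assumes the folder names given in the steps above (poc/eb-pre, poc/kp-editions, the data/kp-editions symlink, and the removed eb07/TXT and eb07/XML folders).

```python
from pathlib import Path


def check_layout(poc: Path) -> list[str]:
    """Return a list of problems found in the expected poc/ layout.

    Hypothetical helper: it only inspects the paths named in the setup steps.
    """
    problems = []
    repo = poc / "eb-pre"
    corpus = poc / "kp-editions"
    link = repo / "data" / "kp-editions"

    if not repo.is_dir():
        problems.append(f"missing repository clone: {repo}")
    if not corpus.is_dir():
        problems.append(f"missing corpus clone: {corpus}")

    # The symlink created with `ln -s ../../kp-editions` must resolve to the corpus.
    if not link.is_symlink():
        problems.append(f"missing symlink: {link}")
    elif link.resolve() != corpus.resolve():
        problems.append(f"symlink {link} does not point to {corpus}")

    # The superseded copies should have been removed.
    for stale in ("TXT", "XML"):
        if (corpus / "eb07" / stale).exists():
            problems.append(f"superseded folder still present: {corpus / 'eb07' / stale}")

    return problems
```

Calling `check_layout(Path("poc"))` from the folder that contains poc returns an empty list when every step above has been applied.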
Create a virtual environment and install the dependencies:
cd poc/eb-pre
python3 -m venv venv
source venv/bin/activate
pip install -U pip
pip install -r build/requirements.txt
Remove the existing index and run the preparation script:
cd poc/eb-pre/tools
rm -f ../data/index.json
python prep.py
Clear the previous semantic search output, then run the classification and compression scripts:
cd poc/eb-pre/tools
rm -f ../data/semantic_search/*
python classify.py
python compress.py ../data/semantic_search/semantic_search-edition_7-doc2vec-learn-mc_40-ng_1-tm_0.5-ch_sentence.tv2.json 2
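The long file name passed to compress.py is easy to mistype. As a hedged sketch, the command can be built from whatever `*.tv2.json` files are present in the semantic_search folder. The helper name `compress_commands` is hypothetical. The sketch assumes, without confirmation from the scripts themselves, that classify.py writes its output with that suffix. The trailing `2` is passed through unchanged because its meaning is not documented here.

```python
from pathlib import Path


def compress_commands(search_dir: Path, trailing_arg: str = "2") -> list[list[str]]:
    """Build one compress.py command line per *.tv2.json file in search_dir.

    Hypothetical helper: it only assembles argument lists and runs nothing.
    """
    return [
        ["python", "compress.py", str(path), trailing_arg]
        for path in sorted(search_dir.glob("*.tv2.json"))
    ]
```

Each returned list can be handed to `subprocess.run(command, check=True)` from poc/eb-pre/tools, for example with `compress_commands(Path("../data/semantic_search"))`.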
Start a local web server from the repository root:
cd poc/eb-pre
python3 -m http.server 8000
- visit the following URL with your browser: http://localhost:8000/docs/
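To confirm that the server actually serves the docs folder, the same step can be sketched with Python's standard library. The function names `start_server` and `docs_status` are hypothetical. Unlike `python3 -m http.server`, this sketch binds to 127.0.0.1 only.

```python
import functools
import http.server
import threading
import urllib.request
from pathlib import Path


def start_server(root: Path, port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Serve the files under root in a background thread and return the server."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(root)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def docs_status(server: http.server.ThreadingHTTPServer) -> int:
    """Request /docs/ from the running server and return the HTTP status code."""
    host, port = server.server_address[:2]
    # Bypass any configured proxy so the request reaches the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"http://{host}:{port}/docs/") as response:
        return response.status
```

With the repository root as `root`, `docs_status(start_server(root))` should return 200 once the docs folder is in place. Call `server.shutdown()` and `server.server_close()` on the returned server when done.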