From a41f0850cbb4241bb6cddd0cf003801f19921bf4 Mon Sep 17 00:00:00 2001
From: Andrew Moore
Date: Fri, 4 Aug 2017 06:25:35 +0100
Subject: [PATCH] Update of install instructions.

---
 README.md                           | 24 +++++-------------------
 results/best_aspect_clf_results.tsv |  4 ++--
 results/best_clf_results.tsv        |  4 ++--
 3 files changed, 9 insertions(+), 23 deletions(-)

diff --git a/README.md b/README.md
index 6b7ecbd..dd84de0 100644
--- a/README.md
+++ b/README.md
@@ -52,7 +52,7 @@ written.
 ## LSTM's
 
 There are two LSTM's both sub class [LSTMModel](./lstms/LSTMModel.py). Note that
-in the paper the standard LSTM is called the Tweeked LSTM in the code base sorry for any
+in the paper the standard LSTM is called the Tweeked LSTM in the code base sorry for any
 confusion.
 
 [Early Stopping LSTM](./lstms/EarlyStoppingLSTM.py) as the name suggests does not have a set number of times
@@ -77,28 +77,14 @@ more relevant this was used as it appeared to work well for this task.
 
 Require:
 1. Python 3.4.3 or above.
+2. graphviz
 
-If you would like to visualise the LSTM's then GraphViz is required for Debian based
-systems this can be installed using:
-
-apt-get install graphviz
-
-### Note on [Unitok-3.0.3](./unitok-3.0.3)
-
-I have included unitok-3.0.3 within this project as this project requires a Python 3
-version and the one currently [available](http://corpus.tools/wiki/Unitok) is
-Python 2 only therefore this version is Python 3 only for English.
-
-To install go to [Unitok-3.0.3](./unitok-3.0.3) folder and run:
-
-python3 setup.py install
-
-### All of the other pips
-
-All the other pips can be installed using the following command:
+And the installation of pip's:
 
 pip3 install -r requirements.txt
 
+Also look at the [config file](./config.yml) to see where to put the data.
+
 
 
 ## [Final output](./final_output)
diff --git a/results/best_aspect_clf_results.tsv b/results/best_aspect_clf_results.tsv
index 9ff6eb3..4d26a87 100644
--- a/results/best_aspect_clf_results.tsv
+++ b/results/best_aspect_clf_results.tsv
@@ -1,2 +1,2 @@
-Mean SD union__ngrams__posextract__expand union__ngrams__posextract__replacement union__ngrams__negextract__replacement union__ngrams__negextract__expand union__ngrams__count_grams__binary union__ngrams__posextract__expand_top_n clf__C union__ngrams__negextract__expand_top_n union__ngrams__text_extract__feature union__ngrams__tokeniser__tokeniser_func union__ngrams__compextract__expand union__ngrams__negextract__words_replace union__ngrams__compextract__words_replace clf__epsilon union__target_extract__count_grams__binary union__ngrams__posextract__words_replace union__ngrams__compextract__replacement union__ngrams__tokeniser__ngram_range union__target_extract__aspect__feature
-0.617100347856 0.0460432874772 Word2Vec(vocab=38074, size=300, alpha=0.025) posword negword Word2Vec(vocab=38074, size=300, alpha=0.025) True 10 0.1 10 text unitok_tokens None Poor word train companies 0.01 True Excellent word companyname (1, 2) aspects
+Mean SD union__ngrams__negextract__expand_top_n union__target_extract__count_grams__binary union__ngrams__compextract__replacement union__ngrams__tokeniser__tokeniser_func union__ngrams__negextract__replacement clf__epsilon union__ngrams__compextract__words_replace union__ngrams__text_extract__feature union__ngrams__posextract__replacement union__ngrams__count_grams__binary union__ngrams__negextract__words_replace union__ngrams__posextract__expand_top_n union__ngrams__posextract__expand union__ngrams__posextract__words_replace union__ngrams__negextract__expand union__ngrams__tokeniser__ngram_range clf__C union__target_extract__aspect__feature union__ngrams__compextract__expand
+0.617105595922 0.0460405648102 10 True companyname unitok_tokens negword 0.01 train companies text posword True Poor word 10 Word2Vec(vocab=38074, size=300, alpha=0.025) Excellent word Word2Vec(vocab=38074, size=300, alpha=0.025) (1, 2) 0.1 aspects None
diff --git a/results/best_clf_results.tsv b/results/best_clf_results.tsv
index f7d665f..49bf9a9 100644
--- a/results/best_clf_results.tsv
+++ b/results/best_clf_results.tsv
@@ -1,2 +1,2 @@
-Mean SD posextract__words_replace negextract__words_replace posextract__replacement posextract__expand negextract__expand_top_n negextract__expand negextract__replacement count_grams__binary tokeniser__tokeniser_func compextract__words_replace compextract__expand clf__C clf__epsilon compextract__replacement posextract__expand_top_n tokeniser__ngram_range
-0.614559367748 0.0468039247619 Excellent word Poor word posword Word2Vec(vocab=38074, size=300, alpha=0.025) 10 Word2Vec(vocab=38074, size=300, alpha=0.025) negword True unitok_tokens train companies None 0.1 0.01 companyname 10 (1, 2)
+Mean SD negextract__words_replace compextract__replacement clf__epsilon posextract__replacement compextract__words_replace posextract__expand_top_n negextract__expand_top_n posextract__expand compextract__expand negextract__expand tokeniser__ngram_range count_grams__binary negextract__replacement posextract__words_replace tokeniser__tokeniser_func clf__C
+0.614564261783 0.0468012629474 Poor word companyname 0.01 posword train companies 10 10 Word2Vec(vocab=38074, size=300, alpha=0.025) None Word2Vec(vocab=38074, size=300, alpha=0.025) (1, 2) True negword Excellent word unitok_tokens 0.1