Releases · CPSSD/LUCAS

10 Dec 09:56

Deniall

Sprint3

8c4ebc5

Sprint 3 (MVP) Latest


Backend:

Investigated SVMs and logistic regression classifiers in much greater depth
Finalised our statistical model, at 74.5% accuracy, using reviewer features and SVC with grid search
Investigated three neural network architectures (FFNN, CNN and RNN) as proofs of concept
Cross-compared the performance of these three architectures using BOW and word2vec
Created custom word embeddings over our datasets using Google's word2vec (attached) and Facebook's fastText
Ran an experiment investigating FFNN architectures with BOW and word2vec
Created and hosted our first neural network model (attached), an FFNN running alongside our SVM returning feature weights
Remodelled and revamped the wiki for documentation
Toyed around and read up on Grove, DCU's GPU instances that we will use next semester to train models
Researched deep learning and neural networks extensively and documented our research in the wiki
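
For readers unfamiliar with the BOW (bag-of-words) input used in the experiments above, here is a minimal sketch in plain Python. The tokenizer, the sample reviews and all function names are illustrative assumptions, not the project's actual preprocessing code.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a review and split it into word tokens (a simplistic, assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(reviews):
    """Map each distinct token to a column index, sorted for reproducibility."""
    tokens = {tok for review in reviews for tok in tokenize(review)}
    return {tok: i for i, tok in enumerate(sorted(tokens))}


def bow_vector(review, vocabulary):
    """Return a count vector for one review; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(review))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical sample reviews, for illustration only.
reviews = ["Great food, great service", "Terrible service"]
vocab = build_vocabulary(reviews)
vectors = [bow_vector(r, vocab) for r in reviews]
```

Each review becomes a fixed-length vector of token counts that a classifier can consume. A word2vec-based input would instead represent each token as a learned dense vector.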

Frontend:

Got rejected by the Yelp API, however...
Integrated with Google Places API, and used an ensemble of Yelp Fusion and Google Places to return Google reviews
Set up a NoSQL OO database on our Yelp dataset to make our data queryable, allowing us pseudo-Yelp access as a backup
Did extensive research on data visualization and color theory, documented in the wiki
Implemented a word cloud indicating the most important words for a particular classification
Grouped best and worst classified reviews to make the result easier to read
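
As a rough illustration of how a word cloud like the one above could be driven by term weights, the sketch below picks the highest-weighted terms and maps them linearly to font sizes. The function name, the size range and the sample weights are all hypothetical; the actual webapp may compute this differently.

```python
def word_cloud_sizes(term_weights, top_k=5, min_size=12, max_size=48):
    """Pick the top_k terms by absolute weight and map them linearly to font sizes."""
    ranked = sorted(term_weights.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_k]
    if not ranked:
        return {}
    magnitudes = [abs(w) for _, w in ranked]
    lo, hi = min(magnitudes), max(magnitudes)
    span = hi - lo
    sizes = {}
    for term, weight in ranked:
        # When all selected weights are equal, fall back to the largest size.
        scale = (abs(weight) - lo) / span if span else 1.0
        sizes[term] = round(min_size + scale * (max_size - min_size))
    return sizes


# Hypothetical term weights, for illustration only.
weights = {"amazing": 2.0, "best": 1.0, "the": 0.1, "ever": 1.5}
sizes = word_cloud_sizes(weights, top_k=3)
```

Using the absolute weight means that strongly negative terms are sized as prominently as strongly positive ones, which is one possible design choice among several.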

Assets 8

02 Dec 14:05

Deniall

Sprint2

a4437da

Sprint 2 Pre-release


Added term weight visualisations to the webapp, along with integrated Yelp search.
Ran many more experiments on the data with multiple statistical classifiers, and cross-compared their accuracies.
Researched the most significant papers in the opinion spam detection field and proposed a novel hypothesis for improving on the state of the art.
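
A cross-comparison of classifier accuracies like the one described above can be sketched as follows. The labels, predictions and classifier names are made up for illustration and are not results from the project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def compare_classifiers(y_true, predictions):
    """Return (name, accuracy) pairs sorted from most to least accurate."""
    scores = {name: accuracy(y_true, preds) for name, preds in predictions.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical labels (1 = deceptive, 0 = genuine) and per-classifier predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "naive_bayes": [1, 0, 0, 1, 0, 1, 1, 0],
    "linear_svm": [1, 0, 1, 1, 0, 1, 1, 0],
    "knn": [0, 0, 0, 1, 1, 1, 1, 0],
}
ranking = compare_classifiers(y_true, predictions)
```

For the comparison to be fair, every classifier should be evaluated on the same held-out examples.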

Assets 2

30 Oct 08:30

Deniall

Sprint1

b499cae

Sprint 1 Pre-release


Niall & Stefan:

Conda environment for managing Python dependencies and versions
Docker environment for replication on any machine
Jupyter Notebook integration detailing experiments and classifier results and metrics
Dataset conversion into Protocol Buffers
Cross comparison and metrics of 4 classifiers: Naive Bayes, Logistic Regression, k-NN and Linear SVM
Feature extraction of data to replicate Stanford paper: POS, structural, sentiment
Understanding of Naive Bayes
Unit testing with Pytest and linting with Pylint
Python API serving a /classify endpoint using a pickled classification model to serve webapp results
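
To make the /classify idea concrete, here is a minimal sketch using only the Python standard library. Everything except the endpoint path is an assumption: the stand-in "model" is a pickled dict of made-up term weights rather than a trained classifier, and the `review` request field, the labels and the function names are hypothetical.

```python
import json
import pickle
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in "model": a pickled dict of term weights plus a bias (an assumption;
# the real project may have pickled a trained classifier object instead).
MODEL_BYTES = pickle.dumps({"weights": {"amazing": 1.5, "terrible": -2.0}, "bias": 0.25})


def classify(model, text):
    """Score a review with the stand-in linear model and return a JSON-ready dict."""
    tokens = text.lower().split()
    score = model["bias"] + sum(model["weights"].get(tok, 0.0) for tok in tokens)
    # Hypothetical labelling rule, for illustration only.
    return {"label": "deceptive" if score >= 0 else "genuine", "score": score}


def make_handler(model):
    """Build a request handler class bound to an already-loaded model."""

    class ClassifyHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/classify":
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", 0))
            try:
                text = json.loads(self.rfile.read(length))["review"]
                if not isinstance(text, str):
                    raise TypeError("review must be a string")
            except (ValueError, KeyError, TypeError):
                self.send_error(400, "expected a JSON body with a 'review' string")
                return
            body = json.dumps(classify(model, text)).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return ClassifyHandler


def make_server(port=8000):
    """Unpickle the model once at startup and return a server ready to run."""
    model = pickle.loads(MODEL_BYTES)
    return HTTPServer(("127.0.0.1", port), make_handler(model))


model = pickle.loads(MODEL_BYTES)
result = classify(model, "Amazing food")
```

Calling `make_server().serve_forever()` would start the server, after which a client can POST a body such as `{"review": "Amazing food"}` to /classify. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.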

Kirill:

Continuous Integration / Continuous Delivery pipeline via CircleCI
Node backend server hosted on Redbrick machine
ReactJS web application using Webpack, Bulma.io and Redux
Unit testing and integration testing using Jest and Chai
End-to-end communication between the frontend and the backend classification API, with results displayed in the webapp

Assets 2
