
Releases: CPSSD/LUCAS

Sprint 3 (MVP)

10 Dec 09:56
8c4ebc5

Backend:

  • Investigated SVMs and logistic regression classifiers in much greater depth
  • Finalised our statistical model, at 74.5% accuracy, using reviewer features and SVC with grid search
  • Investigated three forms of neural network architectures, FFNN, CNN and RNN, in a POC fashion
  • Cross-compared the performance of these three architectures using BOW and word2vec
  • Created custom word embeddings over our datasets using Google's word2vec (attached) and Facebook's fastText
  • Did an experiment investigating FFNN architectures with BOW and word2vec
  • Created and hosted our first neural network model (attached), an FFNN running alongside our SVM returning feature weights
  • Restructured and revamped the wiki for documentation
  • Experimented with and read up on Grove, DCU's GPU instances, which we will use next semester to train models
  • Researched deep learning and neural networks extensively and documented our research in the wiki
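
The BOW (bag-of-words) representation used in the comparisons above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual pipeline: the function names, the tokenisation rule and the sample reviews are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = {tok for doc in documents for tok in tokenize(doc)}
    return {tok: i for i, tok in enumerate(sorted(tokens))}


def bow_vector(text, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical sample reviews, used only to show the shape of the output.
reviews = ["Great food, great service", "Terrible service"]
vocab = build_vocabulary(reviews)
vectors = [bow_vector(r, vocab) for r in reviews]
```

Each review becomes a fixed-length vector of term counts, which is the kind of input a statistical classifier or an FFNN can consume directly; word2vec embeddings would replace these sparse counts with dense vectors.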

Frontend:

  • Got rejected by the Yelp API; however, we integrated with the Google Places API and used an ensemble of Yelp Fusion and Google Places to return Google reviews
  • Set up a NoSQL OO database on our Yelp dataset to make our data queryable, allowing us pseudo-Yelp access as a backup
  • Did extensive research on data visualization and color theory, documented in the wiki
  • Implemented a word cloud showing the words that contributed most to a particular classification
  • Grouped best and worst classified reviews to make the result easier to read
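
One plausible way to drive a word cloud like the one above from per-term weights is sketched below. The weights, the token list and the font-size range are hypothetical values chosen for the example, and the linear scaling is an assumption rather than the webapp's actual logic.

```python
def top_terms(term_weights, review_tokens, k=3):
    """Rank the terms that occur in a review by absolute weight."""
    present = set(review_tokens)
    scored = [(t, w) for t, w in term_weights.items() if t in present]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:k]


def cloud_sizes(ranked, min_size=12, max_size=48):
    """Scale absolute weights linearly onto a font-size range."""
    if not ranked:
        return {}
    largest = max(abs(w) for _, w in ranked)
    if largest == 0:
        return {term: min_size for term, _ in ranked}
    span = max_size - min_size
    return {
        term: round(min_size + span * (abs(w) / largest))
        for term, w in ranked
    }


# Hypothetical term weights, as a linear classifier might expose them.
weights = {"amazing": 0.8, "food": 0.1, "never": -0.6, "service": 0.2}
tokens = ["amazing", "food", "never", "again"]

ranked = top_terms(weights, tokens, k=2)
sizes = cloud_sizes(ranked)
```

The sign of each weight is kept in `ranked`, so a frontend could colour terms by the class they push towards while sizing them by magnitude.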

Sprint 2

02 Dec 14:05
a4437da
Sprint 2 Pre-release

Added term weight visualisations to the webapp, along with integrated Yelp search.
Did a whole bunch more experiments on the data with multiple statistical classifiers, and cross-compared accuracies.
Did a bunch of research on the most significant papers in the opinion spam detection field and proposed a novel hypothesis for improving the cutting edge.

Sprint 1

30 Oct 08:30
b499cae
Sprint 1 Pre-release

Niall & Stefan:

  • Conda environment for managing Python dependencies and versions
  • Docker environment for replication on any machine
  • Jupyter Notebook integration detailing experiments and classifier results and metrics
  • Dataset conversion into Protocol Buffers
  • Cross comparison and metrics of 4 classifiers: Naive Bayes, Logistic Regression, k-NN and Linear SVM
  • Feature extraction of data to replicate Stanford paper: POS, structural, sentiment
  • Understanding of Naive Bayes
  • Unit testing with Pytest and linting with Pylint
  • Python API serving a /classify endpoint using a pickled classification model to serve webapp results
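
The idea of a /classify endpoint backed by a pickled model can be sketched with the standard library alone. Everything here is an assumption made for illustration: the model is a toy dictionary of term weights rather than a trained classifier, and the JSON field names, the labels and the port are hypothetical; the project's real API may look quite different.

```python
import json
import pickle
from http.server import BaseHTTPRequestHandler, HTTPServer

# Toy stand-in for a trained model (hypothetical weights and bias).
toy_model = {"weights": {"unbelievable": 1.5, "waiter": -0.5}, "bias": -0.25}

# Round-trip through pickle, as a stored model file would be loaded.
MODEL = pickle.loads(pickle.dumps(toy_model))


def predict(model, text):
    """Sum the weights of known tokens and threshold the score at zero."""
    score = model["bias"]
    for token in text.lower().split():
        score += model["weights"].get(token, 0.0)
    return "deceptive" if score > 0 else "genuine"


def classify_payload(model, body):
    """Turn a JSON request body into a JSON response body (bytes)."""
    payload = json.loads(body)
    label = predict(model, payload["review"])
    return json.dumps({"label": label}).encode("utf-8")


class ClassifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only the /classify path is served in this sketch.
        if self.path != "/classify":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        response = classify_payload(MODEL, self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port=8000):
    """Start the server; call this explicitly to run the endpoint."""
    HTTPServer(("127.0.0.1", port), ClassifyHandler).serve_forever()
```

Keeping `classify_payload` separate from the HTTP handler means the classification path can be unit tested without starting a server, which fits the Pytest setup mentioned above.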

Kirill:

  • Continuous Integration / Continuous Delivery pipeline via CircleCI
  • Node backend server hosted on Redbrick machine
  • ReactJS web application using Webpack, Bulma.io and Redux
  • Unit testing and integration testing using Jest and Chai
  • End-to-end communication of frontend and backend API classification model and displaying of results