This project is part of the "Machine Learning with Python - From Linear Models to Deep Learning" course offered by MIT.
It implements the following classification algorithms:
- Perceptron Algorithm: Updates the classification parameters one example at a time whenever that example is misclassified, iterating through the dataset multiple times (T passes) so the parameters converge to a separating solution when one exists.
- Average Perceptron Algorithm: Performs the same updates as the Perceptron but returns the average of the parameters over all iterations, which typically yields a more stable model.
- Pegasos Algorithm: Optimizes a linear classifier with stochastic gradient descent on the hinge loss and includes a regularization term to prevent overfitting. (A sketch of the Perceptron and Pegasos single-step updates appears after this list.)
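To make the update rules concrete, here is a minimal sketch of the Perceptron and Pegasos single-step updates. The function names and signatures are illustrative and may not match project1.py exactly.

```python
import numpy as np

def perceptron_single_step(x, y, theta, theta_0):
    """One Perceptron update: parameters change only if (x, y) is misclassified
    (or sits exactly on the decision boundary)."""
    if y * (np.dot(theta, x) + theta_0) <= 0:
        theta = theta + y * x
        theta_0 = theta_0 + y
    return theta, theta_0

def pegasos_single_step(x, y, lam, eta, theta, theta_0):
    """One Pegasos update with step size eta and regularization strength lam.
    The regularizer shrinks theta at every step; the hinge-loss term contributes
    only when the margin is below 1."""
    if y * (np.dot(theta, x) + theta_0) <= 1:
        theta = (1 - eta * lam) * theta + eta * y * x
        theta_0 = theta_0 + eta * y
    else:
        theta = (1 - eta * lam) * theta
    return theta, theta_0
```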
Bag of Words (BoW) is a technique used in Natural Language Processing (NLP) to represent text data. It treats each document as an unordered collection of words, ignoring grammar and word order, and encodes it as a vector of word counts or indicators over a fixed vocabulary. This representation is useful for tasks like text classification and sentiment analysis.
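As an illustration of the idea, a minimal BoW pipeline might look like the sketch below. The tokenization rule and the extract_bow_feature_vectors helper are assumptions made for this example, not necessarily the exact code in project1.py.

```python
import re
import numpy as np

def extract_words(text):
    """Lowercase the text and split on non-alphanumeric characters (a simple tokenizer)."""
    return [w for w in re.split(r"\W+", text.lower()) if w]

def bag_of_words(texts):
    """Map each distinct word in the corpus to a column index."""
    dictionary = {}
    for text in texts:
        for word in extract_words(text):
            if word not in dictionary:
                dictionary[word] = len(dictionary)
    return dictionary

def extract_bow_feature_vectors(texts, dictionary):
    """Binary indicator features: entry (i, j) is 1 if word j appears in document i."""
    features = np.zeros((len(texts), len(dictionary)))
    for i, text in enumerate(texts):
        for word in extract_words(text):
            if word in dictionary:
                features[i, dictionary[word]] = 1
    return features
```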
The file project1.py contains implementations of the aforementioned algorithms along with helper functions for feature extraction and classification:
- hinge_loss_single: Calculates the hinge loss for a single data point given the classification parameters (see the sketch after this list).
- hinge_loss_full: Computes the average hinge loss over an entire dataset.
- perceptron: Implements the Perceptron algorithm.
- average_perceptron: Implements the Average Perceptron algorithm.
- pegasos_single_step_update: Updates classification parameters using the Pegasos algorithm.
- pegasos: Implements the Pegasos algorithm for optimization.
- classify: Classifies data points using the given parameters.
- classifier_accuracy: Computes the accuracy of a classifier on both training and validation data.
- accuracy: Computes the fraction of correct predictions.
- extract_words and bag_of_words: Helper functions for BoW feature extraction.
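For reference, here is a minimal sketch of the hinge-loss helpers under the usual definition, max(0, 1 - y(theta·x + theta_0)), averaged over the dataset for the full version. It is an illustration and not necessarily identical to the code in project1.py.

```python
import numpy as np

def hinge_loss_single(feature_vector, label, theta, theta_0):
    """Hinge loss for one example: max(0, 1 - y * (theta . x + theta_0))."""
    margin = label * (np.dot(theta, feature_vector) + theta_0)
    return max(0.0, 1.0 - margin)

def hinge_loss_full(feature_matrix, labels, theta, theta_0):
    """Average hinge loss over all rows of the feature matrix."""
    margins = labels * (feature_matrix @ theta + theta_0)
    return np.mean(np.maximum(0.0, 1.0 - margins))
```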
To use these algorithms, import the necessary functions from project1.py
and pass the required inputs according to each function's documentation.
For example:
from project1 import perceptron, classify, classifier_accuracy
# Load your data and feature matrices
# ...
# Train the perceptron algorithm
theta, theta_0 = perceptron(feature_matrix, labels, T=10)
# Classify new data points
predictions = classify(new_feature_matrix, theta, theta_0)
# Compute classifier accuracy
train_accuracy, val_accuracy = classifier_accuracy(
    perceptron, train_feature_matrix, val_feature_matrix, train_labels, val_labels, T=10
)