rahulponnusamy/Miniproject1-SA

Miniproject1

Sentiment Analysis on code-mixed Tamil-English text

Abstract

This project addresses two tasks: Sentiment Analysis and Offensive Language Identification on Dravidian languages (Tamil and Malayalam). Both tasks have been popular research topics for several years. The dataset we used is full of code-mixed text, which makes extracting sentiment and detecting offensive content from a sentence challenging. Our first step is to clean the noisy text and remove unnecessary content. We then build baseline models by training traditional Machine Learning (ML) models, namely Support Vector Machine (SVM), Naïve Bayes classifier, Logistic Regression, and Random Forest, on feature vectors extracted with the TF-IDF method. Next, we build Neural Network models, namely a Multilayer Perceptron (MLP) and a Long Short-Term Memory (LSTM) network, with features extracted using the Word2Vec one-hot method. The same methods are applied to both Sentiment Analysis and Offensive Language Identification. Measured by F1 score, the traditional machine learning algorithms perform much better than the neural network approaches.
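The baseline part of this pipeline (cleaning, TF-IDF features, four traditional classifiers, F1 scoring) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the project's actual code. The `clean_text` rules, the toy sentences, the hyperparameters, and the choice of macro-averaged F1 are all assumptions made for the example. The neural models are not covered here.

```python
import re
import string

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

PUNCTUATION = re.compile(f"[{re.escape(string.punctuation)}]")


def clean_text(text: str) -> str:
    """Hypothetical cleaning step; the project's exact rules are not stated."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"[@#]\w+", " ", text)       # drop mentions and hashtags
    text = PUNCTUATION.sub(" ", text)          # drop ASCII punctuation only
    return re.sub(r"\s+", " ", text).strip()   # collapse whitespace


# Made-up romanized Tamil-English samples standing in for the real dataset.
texts = [
    "Padam semma super!!!",
    "Trailer vera level mass @fan",
    "Super movie, semma acting",
    "Love this song, vera level",
    "Padam mokka, waste of time",
    "Worst movie, full bore",
    "Mokka trailer https://example.com",
    "Waste padam, worst acting",
]
labels = ["positive"] * 4 + ["negative"] * 4

# The four baseline classifiers named in the abstract (settings are assumed).
models = {
    "svm": LinearSVC(),
    "naive_bayes": MultinomialNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, clf in models.items():
    # TF-IDF turns each cleaned sentence into a sparse feature vector.
    pipeline = make_pipeline(TfidfVectorizer(preprocessor=clean_text), clf)
    pipeline.fit(texts, labels)
    predictions = pipeline.predict(texts)
    # Scored on the training data only to show the API; use a held-out split in practice.
    scores[name] = f1_score(labels, predictions, average="macro")
    print(f"{name}: macro F1 = {scores[name]:.2f}")
```

Because the abstract states that the same methods are used for both tasks, swapping in offensive-language labels for `labels` would reuse this sketch unchanged.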
