This project focuses on classifying tweets into positive or negative sentiments. We use the Sentiment140 dataset for training and testing our model, displaying the results using bar graphs and pie charts.
- Overview
- Dataset
- Dependencies
- Exploratory Data Analysis
- Data Preprocessing
- Model Training and Evaluation
- Results
- Conclusion
- Usage
The goal of this project is to classify tweets as positive or negative using the Sentiment140 dataset, which contains 1,600,000 tweets extracted using the Twitter API. We visualize the data to understand the distribution of sentiments and apply machine learning techniques to build a classification model.
The Sentiment140 dataset includes the following fields:
- target: Polarity of the tweet (0 = negative, 1 = positive)
- ids: Unique id of the tweet
- date: Date of the tweet
- flag: Query (if no query, it's NO_QUERY)
- user: Name of the user who tweeted
- text: Text of the tweet
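The fields above can be loaded with pandas roughly as follows. This is a minimal sketch, assuming the commonly distributed Sentiment140 CSV layout (no header row, Latin-1 encoding). The function name `load_sentiment140` and the two in-memory sample rows are illustrative only, and remapping a raw positive label of 4 to 1 is an assumption about how the 0/1 polarity is obtained.

```python
import io

import pandas as pd

# Column names taken from the field list in this README
COLUMNS = ["target", "ids", "date", "flag", "user", "text"]


def load_sentiment140(source):
    """Read a header-less Sentiment140-style CSV into a DataFrame."""
    df = pd.read_csv(source, names=COLUMNS, encoding="latin-1")
    # Assumption: the raw file marks positive tweets with 4; map that to 1
    df["target"] = df["target"].replace(4, 1)
    return df


# Tiny in-memory stand-in for the real file (illustrative rows, not real data)
sample = io.StringIO(
    '"0","1","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","user_a","so sad today"\n'
    '"4","2","Mon Apr 06 22:19:49 PDT 2009","NO_QUERY","user_b","what a great day"\n'
)
df = load_sentiment140(sample)
print(df[["target", "text"]])
```

To use the real dataset, pass the path to the downloaded CSV file instead of the in-memory sample.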
To run this project, you'll need the following Python packages:
import pandas as pd
import matplotlib.pyplot as plt
import nltk
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB
import tweepy
import warnings
We start by loading and exploring the dataset to understand the distribution of sentiments and other features.
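One way to inspect the sentiment distribution is sketched below. It assumes a DataFrame with the `target` and `text` columns described earlier. The five rows are a hand-made stand-in, not project data, and the `n_chars` column is a hypothetical extra feature added for illustration.

```python
import pandas as pd

# Hand-made stand-in for the loaded dataset (not real tweets)
df = pd.DataFrame({
    "target": [0, 1, 1, 0, 1],
    "text": ["bad day", "love it", "so happy", "this is awful", "great news"],
})

# Number of tweets per sentiment label
counts = df["target"].value_counts().sort_index()

# Fraction of tweets per sentiment label
shares = df["target"].value_counts(normalize=True).sort_index()

# Hypothetical extra feature: tweet length in characters, averaged per label
df["n_chars"] = df["text"].str.len()
mean_len = df.groupby("target")["n_chars"].mean()

print(counts, shares, mean_len, sep="\n\n")
```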
We clean and preprocess the data to prepare it for model training.
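The exact cleaning steps are not listed here, so the following is only a sketch of common tweet preprocessing: lowercasing, removing URLs and @mentions, dropping non-letter characters, and filtering stopwords. The function name `clean_tweet` is hypothetical, and the small hardcoded `STOPWORDS` set stands in for the NLTK English stopword list so the example runs without a download.

```python
import re

# Small stand-in for nltk.corpus.stopwords.words("english")
STOPWORDS = {
    "a", "an", "the", "is", "am", "are", "to", "of",
    "and", "in", "on", "for", "at", "it", "i", "my",
}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Return a lowercased, de-noised version of a tweet."""
    text = text.lower()
    text = URL_RE.sub(" ", text)         # drop links
    text = MENTION_RE.sub(" ", text)     # drop @user mentions
    text = NON_LETTER_RE.sub(" ", text)  # drop digits, punctuation, '#'
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)


print(clean_tweet("@bob I LOVED the new phone!! http://t.co/abc #happy"))
# prints: loved new phone happy
```

Negation words such as "not" are deliberately left out of the stand-in stopword set, since removing them can flip the apparent sentiment of a tweet.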
We train a Bernoulli Naive Bayes model and evaluate its performance.
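A minimal sketch of this step, using the `CountVectorizer` and `BernoulliNB` classes from the dependency list. The eight-tweet corpus, the 75/25 split, and the choice of plain accuracy as the metric are assumptions made for illustration. The printed score says nothing about the real model.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline

# Toy corpus standing in for the cleaned tweets (not real data)
texts = [
    "love this phone", "great day today", "so happy right now", "best movie ever",
    "hate this weather", "worst day ever", "so sad right now", "terrible service",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)

# binary=True records word presence/absence, which is what BernoulliNB models
model = make_pipeline(CountVectorizer(binary=True), BernoulliNB())
model.fit(X_train, y_train)

preds = model.predict(X_test)
acc = accuracy_score(y_test, preds)
print(f"Accuracy on the toy hold-out set: {acc:.2f}")
```

Wrapping the vectorizer and classifier in one pipeline means the vocabulary is learned only from the training split, which avoids leaking test data into the features.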
We visualize the results using bar graphs and pie charts.
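The two chart types can be produced with matplotlib along these lines. The counts are placeholders, not project results, and the output file name `sentiment_distribution.png` is hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

labels = ["negative", "positive"]
counts = [2, 3]  # placeholder counts, not project results

fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(9, 4))

# Bar graph: absolute number of tweets per sentiment
ax_bar.bar(labels, counts, color=["tab:red", "tab:green"])
ax_bar.set_title("Tweets per sentiment")
ax_bar.set_ylabel("Number of tweets")

# Pie chart: share of each sentiment
ax_pie.pie(counts, labels=labels, autopct="%1.1f%%", startangle=90)
ax_pie.set_title("Sentiment share")

fig.tight_layout()
fig.savefig("sentiment_distribution.png")
```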
Our model achieves good accuracy in classifying tweets into positive and negative sentiments. Further improvements can be made by exploring other machine learning algorithms and fine-tuning the preprocessing steps.
To run the project:
- Clone the repository:
git clone https://github.com/Devubavariaa/TWITTER-SENTIMENTAL-ANAYLSIS
- Navigate to the project directory:
cd TWITTER-SENTIMENTAL-ANAYLSIS
- Install dependencies:
pip install -r requirements.txt
- Run the analysis:
python analysis.py