Web Mining project that uses descriptive statistics and NLP techniques to analyze the behavior of a Twitter account (my personal one) and the content of its tweets.
All tweets were collected from a specific account using the Twitter API.
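As a rough illustration of the collection step, the sketch below pulls one account's timeline with tweepy and trims each tweet to a few fields. It is not necessarily how this project does it: the function names `to_record` and `fetch_timeline`, the keys of the `creds` dictionary and the selected fields are all assumptions made for the example, and it presumes valid credentials for the v1.1 Twitter API.

```python
from typing import Any, Dict, Iterator


def to_record(tweet_json: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the fields a later analysis is likely to need (hypothetical selection)."""
    return {
        "id": tweet_json["id"],
        "created_at": tweet_json["created_at"],
        # Extended-mode payloads carry "full_text"; classic ones carry "text".
        "text": tweet_json.get("full_text", tweet_json.get("text", "")),
        "lang": tweet_json.get("lang"),
        "retweets": tweet_json.get("retweet_count", 0),
        "favorites": tweet_json.get("favorite_count", 0),
    }


def fetch_timeline(screen_name: str, creds: Dict[str, str],
                   limit: int = 200) -> Iterator[Dict[str, Any]]:
    """Yield simplified tweets from one account's timeline."""
    import tweepy  # imported lazily so to_record() can be used without tweepy

    # The credential key names below are assumptions for this sketch.
    auth = tweepy.OAuthHandler(creds["consumer_key"], creds["consumer_secret"])
    auth.set_access_token(creds["access_token"], creds["access_token_secret"])
    api = tweepy.API(auth, wait_on_rate_limit=True)

    cursor = tweepy.Cursor(api.user_timeline, screen_name=screen_name,
                           tweet_mode="extended")
    for status in cursor.items(limit):
        # status._json holds the raw payload returned by the API.
        yield to_record(status._json)
```

Keeping `to_record` separate from the network call means the field selection can be checked against a saved sample tweet without touching the API.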
In the following folder you can see the results of the previous analyses.
The project was carried out with the latest version of Anaconda on Windows.
To reproduce the environment with conda, first create the virtual environment:
conda env create -f twitter-analytics-env.yml
Then activate it:
conda activate twitter-analytics-env
The specific Python 3.7.x libraries used are:
# Import util libraries
import tweepy
import random
import numpy as np
import pandas as pd
import yaml
import warnings
import calendar
import time
from datetime import date
from PIL import Image
from collections import Counter
# Import NLP libraries
import re
import spacy.lang.es as es
import spacy.lang.en as en
from textblob import TextBlob
from wordcloud import WordCloud
# Import plot libraries
import matplotlib.pyplot as plt
# Save tweets into MongoDB
from pymongo import MongoClient
from requests.exceptions import Timeout, SSLError, ConnectionError
from requests.packages.urllib3.exceptions import ReadTimeoutError, ProtocolError
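To give an idea of the kind of descriptive statistics a few of these libraries make possible, here is a minimal sketch that counts tweets per weekday and extracts the most frequent hashtags. The sample tweets, the date format and the function names `tweets_per_weekday` and `top_hashtags` are assumptions made for the example, not taken from the project's notebooks.

```python
import calendar
import re
from collections import Counter
from datetime import datetime

# Hypothetical sample data; in the project the tweets come from the Twitter API.
tweets = [
    {"created_at": "2020-05-24 10:15:00", "text": "Learning #NLP with spaCy today"},
    {"created_at": "2020-05-25 18:30:00", "text": "More #NLP and #Python experiments"},
    {"created_at": "2020-05-31 09:00:00", "text": "Weekend reading on #Python"},
]


def tweets_per_weekday(items):
    """Count how many tweets were posted on each day of the week."""
    counts = Counter()
    for tweet in items:
        posted = datetime.strptime(tweet["created_at"], "%Y-%m-%d %H:%M:%S")
        counts[calendar.day_name[posted.weekday()]] += 1
    return counts


def top_hashtags(items, n=5):
    """Return the n most frequent hashtags, lower-cased, with their counts."""
    tags = Counter()
    for tweet in items:
        tags.update(tag.lower() for tag in re.findall(r"#(\w+)", tweet["text"]))
    return tags.most_common(n)


print(tweets_per_weekday(tweets))
print(top_hashtags(tweets))
```

Counts like these can be passed directly to pandas or matplotlib for tabulation and plotting.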
Below are some useful links relevant to this project:
Any kind of feedback or suggestion is greatly appreciated (algorithm design, documentation, improvement ideas, spelling mistakes, etc.). If you want to contribute to the project, you can do so through a PR.
- Created by Andrés Segura-Tinoco
- Created on May 24, 2020
- Updated on Aug 16, 2021
This project is licensed under the terms of the MIT license.