anth0nyhak1m/nlp_project_yas

Fake News Classifier

CAPP 30254 ML Final Project

Authors:

  • Anthony Hakim
  • Sasha Filippova
  • Yifu Hou

Project Description:

Research Question: Can we identify fake news articles based on article title alone?

In this project, our team designed two Natural Language Processing (NLP) machine learning models to classify fake news articles using only article titles. Our baseline model combines TF-IDF features with logistic regression and classifies fake news articles with 94% accuracy. We also apply a pre-trained BERT model for classification and find that the more complex model performs with lower accuracy.
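
As a rough illustration of the baseline idea (TF-IDF features fed into logistic regression), here is a minimal sketch using scikit-learn. It is not the code from baseline_model.ipynb: the toy titles, the label encoding (1 = fake, 0 = real), and the vectorizer settings are all assumptions made for this example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy titles for illustration only (not taken from the dataset).
titles = [
    "You won't believe what this senator just did",
    "SHOCKING: secret plan exposed in leaked video",
    "Watch: celebrity destroys reporter in epic rant",
    "Senate passes budget bill after lengthy debate",
    "Central bank holds interest rates steady",
    "Trade talks resume between the two countries",
]
labels = [1, 1, 1, 0, 0, 0]  # assumed encoding: 1 = fake, 0 = real

# TF-IDF turns each title into a sparse weighted term vector;
# logistic regression then learns a linear decision boundary on those vectors.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(titles, labels)

# Predicted labels for the training titles, and class probabilities
# (columns ordered as model.classes_) for one unseen title.
predictions = model.predict(titles)
probabilities = model.predict_proba(["Leaked video reveals secret budget plan"])
```

On real data, the model would be fit on a training split and scored on a held-out split; the toy example above only shows how the pieces connect.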

Directory:

  • baseline_model.ipynb: TF-IDF logistic regression training and testing.
  • classification.ipynb: Final BERT model hyperparameter tuning, training and testing.
  • original_bert.ipynb: Baseline BERT model training and testing.
  • util.py: helper functions for data preprocessing.
  • data/: directory containing data.
  • final_presentation: final presentation of results.

Data Visualization:

[Image: data visualization]

Data Source:

https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset?select=True.csv
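
A minimal sketch of how the titles might be loaded and labeled with pandas. It assumes the dataset provides one CSV of real articles (True.csv, as in the link above) and one of fake articles (assumed here to be named Fake.csv), each with a `title` column. The function name `load_titles` and the label encoding are hypothetical and are not taken from util.py.

```python
import pandas as pd


def load_titles(true_csv, fake_csv):
    """Return a shuffled DataFrame with columns 'title' and 'label'.

    Assumed encoding: label 0 = real article, label 1 = fake article.
    """
    real = pd.read_csv(true_csv, usecols=["title"]).assign(label=0)
    fake = pd.read_csv(fake_csv, usecols=["title"]).assign(label=1)

    # Stack both sources and drop rows with a missing title.
    combined = pd.concat([real, fake], ignore_index=True)
    combined = combined.dropna(subset=["title"])

    # Shuffle with a fixed seed so the row order is reproducible.
    return combined.sample(frac=1.0, random_state=0).reset_index(drop=True)
```

For example, `load_titles("data/True.csv", "data/Fake.csv")` would return one row per article title, assuming the files are stored under data/ with those names.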
