GitHub - iamkrut/Toxic-Comment-Classification: Kaggle Competition: Toxic Comment Classification Challenge - Identify and classify toxic online comments

Downloading the dataset:

Download the dataset from Kaggle (Toxic Comment Classification Challenge): https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/data
The root directory should contain the data folder with the dataset from Kaggle.
Install the modules listed in requirements.txt.
The data folder needs to be placed inside the source folder extracted from source.zip.
For project.ipynb to display images, download the Images folder from GitLab via this link: https://csil-git1.cs.surrey.sfu.ca/krutp/nlpclass-1197-g-lexchunkers/-/archive/master/nlpclass-1197-g-lexchunkers-master.zip?path=project%2FImages
Place the Images folder inside the source folder extracted from source.zip.
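
Before running any model, it can help to confirm that the data folder holds what the scripts expect. The following is a minimal sketch, not part of the repository: the CSV names are assumed from the Kaggle competition download, the embedding names are taken from the run instructions in this README, and `missing_data_files` is a hypothetical helper.

```python
from pathlib import Path

# Assumed names of the CSVs in the Kaggle competition download.
COMPETITION_FILES = ["train.csv", "test.csv", "sample_submission.csv"]
# Embedding files named in this README (needed only for LSTM and TextCNN).
EMBEDDING_FILES = ["crawl-300d-2M.vec", "glove.840B.300d.txt"]


def missing_data_files(root, expected=COMPETITION_FILES + EMBEDDING_FILES):
    """Return the expected file names that are absent from <root>/data."""
    data_dir = Path(root) / "data"
    return [name for name in expected if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Run from the source folder so that ./data is the folder checked.
    missing = missing_data_files(".")
    if missing:
        print("Missing from data/:", ", ".join(missing))
    else:
        print("All expected files found in data/.")
```

If only Logistic Regression is being run, passing `expected=COMPETITION_FILES` skips the embedding check.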

How to run the three models:

For Logistic Regression

Run python3 Log_reg/log_regression.py

For LSTM

Download crawl-300d-2M.vec and glove.840B.300d.txt and put them in the data folder.
Run python3 LSTM/LSTM.py

For TextCNN

Download crawl-300d-2M.vec.zip and extract it into the data folder.
Run python3 TextCNN/textCNN.py
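
For readers unfamiliar with the .vec files mentioned above: the format is plain text, with a header line giving the word count and vector dimension, followed by one word and its values per line. The sketch below shows one way to read such a file while keeping only the words in a given vocabulary. It is an illustration under that assumption, not the loader used by LSTM.py or textCNN.py, and `load_vec` is a hypothetical name. The GloVe .txt file has no header line, so it would need the header step removed.

```python
import io


def load_vec(handle, vocab=None):
    """Read word vectors from a text stream in .vec format.

    The first line is "<word count> <dimension>"; each later line is a
    word followed by <dimension> space-separated floats. If vocab is
    given, only words in it are kept, which limits memory use.
    """
    header = handle.readline().split()
    dim = int(header[1])
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if vocab is not None and word not in vocab:
            continue
        if len(values) != dim:
            continue  # skip malformed rows
        vectors[word] = [float(v) for v in values]
    return dim, vectors


# Tiny in-memory sample standing in for data/crawl-300d-2M.vec.
sample = io.StringIO(
    "3 4\n"
    "hello 0.1 0.2 0.3 0.4\n"
    "world 0.5 0.6 0.7 0.8\n"
    "rude -0.1 0.0 0.1 0.2\n"
)
dim, vectors = load_vec(sample, vocab={"hello", "rude"})
print(dim, sorted(vectors))  # prints: 4 ['hello', 'rude']
```

For the real file, open it with `open("data/crawl-300d-2M.vec", encoding="utf-8")` and pass the handle in place of `sample`.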

NOTE: If something doesn't work, clone the project directory from https://csil-git1.cs.surrey.sfu.ca/krutp/nlpclass-1197-g-lexchunkers/tree/master/project. The word embeddings still have to be downloaded separately.
NOTE: The report file project.ipynb contains images, so the Images folder needs to be downloaded from GitLab.

Checking the output files: output.zip contains the submission predictions generated by the three models. Submit them to Kaggle for evaluation.
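
Before uploading a prediction file, a quick format check can catch problems early. The sketch below assumes the column layout of the competition's sample_submission.csv (an id column followed by six label probabilities); the names of the files inside output.zip are not stated here, so the example runs on an in-memory sample. `check_submission` is a hypothetical helper, not part of the repository.

```python
import csv
import io

# Assumed from the competition's sample_submission.csv.
EXPECTED_COLUMNS = [
    "id", "toxic", "severe_toxic", "obscene",
    "threat", "insult", "identity_hate",
]


def check_submission(handle):
    """Return a list of problems found in a submission CSV stream."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]
    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for row_number, row in enumerate(reader, start=2):
        for label in EXPECTED_COLUMNS[1:]:
            try:
                value = float(row[label])
            except (TypeError, ValueError):
                problems.append(f"line {row_number}: {label} is not a number")
                continue
            if not 0.0 <= value <= 1.0:
                problems.append(f"line {row_number}: {label} outside [0, 1]")
    return problems


sample = io.StringIO(
    "id,toxic,severe_toxic,obscene,threat,insult,identity_hate\n"
    "a1,0.9,0.1,0.8,0.0,0.7,0.05\n"
    "a2,1.5,0.0,0.0,0.0,0.0,0.0\n"
)
print(check_submission(sample))  # prints: ['line 3: toxic outside [0, 1]']
```

An empty list means the file matched the assumed layout and every probability was in range.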

Folders and files:

Images
LSTM
Log_reg
TextCNN
data
output
EDA.ipynb
README.md
preprocessing.ipynb
project.ipynb
requirements.txt
utility.py