
You're Toxic, I'm Slippin' Under: Toxic Comment Classification Challenge

In digital communities and forums on the internet, users often choose to remain anonymous, as real names are not required when conversing with strangers online. This anonymity brings the freedom to express one's thoughts without fear of being judged or recognized, but it also means that users can post abusive comments with little to no repercussions. While most online forums and social media sites have various moderation mechanisms (e.g., moderators and staff who manually review posts and comments, a report button under messages, and voting on comments and posts), these methods are not enough to combat the significant number of toxic comments being made.

Because of this, automated methods for detecting toxicity in online text should be improved to foster a safe and respectful online environment.

The Toxic Comment Classification Challenge is a Kaggle challenge by the Conversation AI team, which is composed of researchers from both Jigsaw and Google. The challenge invites participants to build a multi-headed model that can detect the types of toxicity (i.e., toxic, severe toxic, obscene, threat, insult, and identity hate) more accurately than Perspective's current models. The dataset provided contains a large number of Wikipedia comments that have been labeled by human raters for toxic behavior.
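
For reference, the six toxicity types above correspond to label columns in the Kaggle training file, and a single comment can carry several of them at once (or none). Below is a minimal sketch of inspecting them with pandas; the file path and column names follow the public Kaggle release and are assumptions about this repository's copy of the data.

```python
# A minimal sketch of loading the challenge data, assuming the original
# Kaggle file name (train.csv) inside this repository's data folder.
import pandas as pd

train = pd.read_csv("data/train.csv")

# The six toxicity labels the multi-headed model predicts; a comment may
# be tagged with several labels at once, or with none of them.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

print(train[["comment_text"] + LABELS].head())
print(train[LABELS].sum())  # how often each type of toxicity appears
```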

This project's best model achieved a private ROC AUC score of 0.97559 and a public ROC AUC score of 0.97622.
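
The challenge's evaluation metric is the mean column-wise ROC AUC, i.e., the ROC AUC computed separately for each of the six labels and then averaged. A minimal sketch of that computation with scikit-learn is shown below; `y_true` and `y_prob` are hypothetical placeholders for held-out labels and predicted probabilities, not this project's actual results.

```python
# A minimal sketch of the competition metric: mean column-wise ROC AUC
# over the six toxicity labels. The label names follow the Kaggle dataset;
# the arrays below are random placeholders, not real model output.
import numpy as np
from sklearn.metrics import roc_auc_score

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(1000, len(LABELS)))  # placeholder ground-truth labels
y_prob = rng.random(size=(1000, len(LABELS)))          # placeholder predicted probabilities

per_label_auc = [roc_auc_score(y_true[:, i], y_prob[:, i]) for i in range(len(LABELS))]
print(dict(zip(LABELS, np.round(per_label_auc, 4))))
print("mean ROC AUC:", round(float(np.mean(per_label_auc)), 4))
```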

Project Files and Folders

This GitHub repository contains three folders and two main files.

Folders

| Folders | Description |
| --- | --- |
| `cleaned_data` | Holds the cleaned versions of the train and test data |
| `data` | Holds the original data from the Kaggle challenge |
| `results` | Holds the predictions of the different algorithms tried |

Jupyter Notebooks

| Files | Description |
| --- | --- |
| `ToxicComment_S13_Group8.ipynb` | Main notebook, which also holds the data cleaning, pre-processing, and EDA |
| `ToxicComment_S13_Group8_Supplementary.ipynb` | Other solutions tried to solve the challenge |

How to set up and run the project locally through Jupyter Notebook or JupyterLab

  1. Extract the folder from the zipped file that you can download through this DownGit link.
  2. Launch Jupyter Notebook or JupyterLab.
  3. Navigate to the project folder containing ToxicComment_S13_Group8.ipynb.
  4. Open ToxicComment_S13_Group8.ipynb.
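
As an alternative to opening the notebook interactively, it can also be executed headlessly from a script. The sketch below assumes the nbformat and nbclient packages are installed (they are not mentioned in the original setup steps) and is run from the project folder.

```python
# A sketch of running the main notebook headlessly, assuming nbformat and
# nbclient are installed (e.g., pip install nbformat nbclient).
import nbformat
from nbclient import NotebookClient

# Read the notebook, execute every cell in order, and write the executed
# copy under a new name so the original notebook stays untouched.
nb = nbformat.read("ToxicComment_S13_Group8.ipynb", as_version=4)
NotebookClient(nb, timeout=600, kernel_name="python3").execute()
nbformat.write(nb, "ToxicComment_S13_Group8_Executed.ipynb")
```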

Authors
