- Dataset
- Initial Stage
- About Files
- Fault
- Final Engine
- How to Configure Final Engine
- Improvement Scope
- Helpful References
- This initial model is only for comparison, or as a starting point without preprocessing and ensembling techniques.
- It reports toxicity after analysing the comment.
- If a comment is toxic or abusive it returns Toxic, otherwise it returns Non-Toxic.
- ToxityAnalysis.ipynb -> notebook used to build the model
- finalized_model.pkl -> saved ML model
- RunModel.py -> reloads the model and analyses a comment (a minimal sketch follows this list)
- app.js -> runs the Node.js app
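A minimal sketch of what RunModel.py plausibly does, assuming the pickled object in finalized_model.pkl is a scikit-learn pipeline that accepts raw comment text; the function name and command-line invocation are illustrative, not taken from the repository:

```python
# Illustrative sketch of RunModel.py: reload the saved model and analyse one comment.
import pickle
import sys

def analyse(comment):
    # The model is reloaded from disk on every invocation of this script.
    with open("finalized_model.pkl", "rb") as f:
        model = pickle.load(f)
    return model.predict([comment])[0]

if __name__ == "__main__":
    print(analyse(sys.argv[1]))
```

Because the script (and therefore the pickled model) is loaded fresh for every submission, each request pays the full model-loading cost, which is the fault noted below.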
- It takes more time because the Python script is reloaded every time a comment is submitted.
- Here is the engine of the model, which is hosted using Flask (a minimal sketch follows this list).
- It resolves the fault of the initial-stage model.
- It only takes time on the first load.
- It returns a toxicity percentage instead of a class label.
- You can see the Jupyter notebook at https://github.com/ckshitij/Comment-Toxicity-Analysis/blob/master/engine/Models/notebook.ipynb
- Here it gives the toxicity percentage of the comment.
- You can see the difference between the percentages of different comments.
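Conceptually, the engine can be a small Flask app that loads the pickled model once at import time and serves probabilities instead of hard labels. The sketch below is illustrative only; the route, form field, and model file name are hypothetical rather than taken from the repository's app.py:

```python
# Illustrative Flask engine; route, form field, and model path are hypothetical.
import pickle
from flask import Flask, request, jsonify

app = Flask(__name__)

# Load the model once at startup, so only the first load is slow.
with open("Models/model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/analyse", methods=["POST"])
def analyse():
    comment = request.form.get("comment", "")
    # Assumes the model exposes predict_proba and index 1 is the toxic class.
    toxicity = model.predict_proba([comment])[0][1]
    return jsonify({"toxicity_percentage": round(100 * toxicity, 2)})

if __name__ == "__main__":
    app.run(port=5000)
```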
- First, download the code directly as a zip or clone it through git.
- Then go to the engine folder.
- Click on the Engine Notebook on Kaggle linked above and download all the .pkl files into the Models folder (a small sanity-check sketch for these files follows the commands below).
- Then go to the terminal and run the app.py file.
- Then open your web browser and go to http://localhost:5000/.
```bash
git clone https://github.com/ckshitij/Comment-Toxicity-Analysis.git
cd Comment-Toxicity-Analysis
cd engine
python3 app.py
```
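Before starting the server, you can quickly confirm that the downloaded .pkl files are in place by trying to unpickle everything in the Models folder. This check is illustrative and not part of the repository:

```python
# Illustrative sanity check: confirm every .pkl in engine/Models loads cleanly.
import pickle
from pathlib import Path

for path in sorted(Path("Models").glob("*.pkl")):
    with open(path, "rb") as f:
        obj = pickle.load(f)
    print(f"{path.name}: loaded {type(obj).__name__}")
```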
- You can improve this code by increasing the n-gram range to (1, 3) or (1, 4) for word-level features and to (1, 10) for character-level features (see the sketch after this list).
- You can use a deep learning RNN (LSTM) model.
- You can also use a word-level CNN model.
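As a sketch of the first suggestion, word-level and character-level TF-IDF features with the wider n-gram ranges could be built and stacked as below; scikit-learn is assumed, and the toy comments and max_features values are illustrative:

```python
# Illustrative TF-IDF setup with the wider n-gram ranges suggested above.
from sklearn.feature_extraction.text import TfidfVectorizer
from scipy.sparse import hstack

comments = ["you are awesome", "you are an idiot"]  # toy examples

word_vec = TfidfVectorizer(analyzer="word", ngram_range=(1, 3), max_features=50000)
char_vec = TfidfVectorizer(analyzer="char", ngram_range=(1, 10), max_features=50000)

# Combine word-level and character-level features into one sparse matrix.
features = hstack([word_vec.fit_transform(comments),
                   char_vec.fit_transform(comments)])
print(features.shape)
```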
- Google Perspective
- Kaggle competition on toxicity
- GitHub mediawiki-utilities
- Detecting Insults in Social Commentary (Kaggle)
- Content Analysis
- Abusive Language Detection in Online User Content
- Beware of Publicity! Perceived Distress of Negative Cyber Incidents and Implications for Defining Cyberbullying
- Learning part-of-speech taggers with inter-annotator agreement loss