In this project, I've analyzed disaster data from Figure Eight and built a model for an API that classifies disaster messages.
This project consists of three parts:
`process_data.py` contains a data cleaning (ETL) pipeline that:
- Loads the messages and categories datasets
- Merges the two datasets
- Cleans the data
- Stores it in a SQLite database
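The cleaning steps above can be sketched as follows. This is a minimal sketch, not the script itself: the `categories` column is assumed to hold semicolon-separated `name-0`/`name-1` pairs, and the function and table names (`load_and_clean`, `save_to_db`, `messages`) are illustrative.

```python
import sqlite3
import pandas as pd

def load_and_clean(messages_path, categories_path):
    # Load the messages and categories datasets and merge them on "id"
    messages = pd.read_csv(messages_path)
    categories = pd.read_csv(categories_path)
    df = messages.merge(categories, on="id")

    # Split the semicolon-separated "categories" column into one
    # binary column per category (assumed "name-0" / "name-1" format)
    cats = df["categories"].str.split(";", expand=True)
    cats.columns = cats.iloc[0].str.rsplit("-", n=1).str[0]
    cats = cats.apply(lambda col: col.str.rsplit("-", n=1).str[1].astype(int))

    # Recombine and drop duplicate rows
    df = pd.concat([df.drop(columns="categories"), cats], axis=1)
    return df.drop_duplicates()

def save_to_db(df, db_path, table="messages"):
    # Store the cleaned data in a SQLite database
    with sqlite3.connect(db_path) as conn:
        df.to_sql(table, conn, index=False, if_exists="replace")
```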
`train_classifier.py` contains a machine learning pipeline that:
- Loads data from the SQLite database
- Splits the dataset into training and test sets
- Builds a text processing and machine learning pipeline
- Trains and tunes a model using GridSearchCV
- Outputs results on the test set
- Exports the final model as a pickle file
`run.py` runs a Flask web app that:
- Displays visualizations of the training data using Plotly
- Runs the model on new messages that you enter yourself
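The `train_classifier.py` steps can be sketched as below. This is a hedged outline, not the real script: the choice of `TfidfVectorizer`, `RandomForestClassifier` inside `MultiOutputClassifier`, and the small parameter grid are assumptions standing in for whatever the actual pipeline uses.

```python
import pickle
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

def build_model():
    # Text processing + multi-label classification pipeline
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer(stop_words="english")),
        ("clf", MultiOutputClassifier(RandomForestClassifier(random_state=0))),
    ])
    # Tiny illustrative grid; a real search space would be larger
    params = {"clf__estimator__n_estimators": [10, 50]}
    return GridSearchCV(pipeline, params, cv=2)

def train_and_export(X, Y, model_path="classifier.pkl"):
    # Split, tune with GridSearchCV, report test results, and pickle the model
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, Y, test_size=0.2, random_state=0)
    model = build_model()
    model.fit(X_train, Y_train)
    print(classification_report(Y_test, model.predict(X_test)))
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    return model
```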
```
- app
| - template
| |- master.html             # main page of web app
| |- go.html                 # classification result page of web app
|- run.py                    # Flask file that runs app
- data
|- disaster_categories.csv   # data to process
|- disaster_messages.csv     # data to process
|- process_data.py           # data cleaning pipeline
|- DisasterResponse.db       # database to save clean data to
- models
|- train_classifier.py       # machine learning pipeline
|- classifier.pkl            # saved model
- img
|- screenshot.png            # screenshot of the web app
- README.md                  # this file
```
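The classification route served by `go.html` can be sketched roughly as below. This is an illustrative outline only: `StubModel` is a hypothetical stand-in for the pickled classifier in `models/classifier.pkl`, and the real `run.py` renders `go.html` with the results rather than returning JSON.

```python
from flask import Flask, request

app = Flask(__name__)

# Hypothetical stand-in for the trained model; the real app would
# unpickle models/classifier.pkl and call its predict() method
class StubModel:
    def predict(self, messages):
        return [["related"] for _ in messages]

model = StubModel()

@app.route("/go")
def go():
    # Read the message the user typed from the query string and
    # return the predicted category labels for it
    query = request.args.get("query", "")
    return {"query": query, "labels": model.predict([query])[0]}
```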