This repository contains my approaches to applying data science to predict survival on the Titanic, a real historical incident and a popular beginner problem on Kaggle.com. The repository includes scripts for data cleanup, feature selection, and data modelling strategies, along with the original data sets and analytics on the data used to select appropriate ML algorithms and validate results. Code is available in both R and Python.
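As a rough illustration of the cleanup-then-validate workflow described above, here is a minimal Python sketch. It is not taken from the repository's scripts. The rows are made-up placeholders that only borrow the `Sex`, `Age`, and `Survived` column names from the Kaggle data set, and the helper names (`impute_age`, `baseline_predict`, `accuracy`) are hypothetical. It assumes median imputation for missing ages and a naive sex-based baseline, which may differ from the strategies actually used here.

```python
from statistics import median

# Made-up illustrative rows using Kaggle Titanic column names.
# These are NOT real passenger records.
rows = [
    {"Sex": "female", "Age": 29.0, "Survived": 1},
    {"Sex": "male", "Age": None, "Survived": 0},
    {"Sex": "female", "Age": None, "Survived": 1},
    {"Sex": "male", "Age": 40.0, "Survived": 0},
    {"Sex": "male", "Age": 4.0, "Survived": 1},
]


def impute_age(rows):
    """Data cleanup: fill missing Age values with the median of known ages."""
    known = [r["Age"] for r in rows if r["Age"] is not None]
    fill = median(known)
    return [{**r, "Age": fill if r["Age"] is None else r["Age"]} for r in rows]


def baseline_predict(row):
    """Naive baseline (an assumption): predict survival for female passengers only."""
    return 1 if row["Sex"] == "female" else 0


def accuracy(rows, predict):
    """Validation: fraction of rows where the prediction matches Survived."""
    hits = sum(predict(r) == r["Survived"] for r in rows)
    return hits / len(rows)


cleaned = impute_age(rows)
score = accuracy(cleaned, baseline_predict)
print(f"baseline accuracy: {score:.2f}")
```

A baseline like this is mainly useful as a reference point: any model built from richer features should beat it on held-out data before it is worth keeping.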
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
What things you need to install the software and how to install them
Give examples
A step by step series of examples that tell you how to get a development env running
Say what the step will be
Give the example
And repeat
until finished
End with an example of getting some data out of the system or using it for a little demo
Explain how to run the automated tests for this system
Explain what these tests test and why
Give an example
Add additional notes about how to deploy this on a live system
- Dropwizard - The web framework used
- Maven - Dependency Management
- ROME - Used to generate RSS Feeds
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
We use SemVer for versioning. For the versions available, see the tags on this repository.
- Hasan Mujtaba - Initial work - PurpleBooth
See also the list of contributors who participated in this project.
This project is licensed under the MIT License - see the LICENSE file for details.
- Hat tip to @alexeyza and @jayadeepj for solution inspiration and code organization ideas
- Inspiration: Dave Langer and his YouTube video series on the Kaggle Titanic challenge