This is mainly inspired by Abhishek Thakur's book. This toolkit helps me in my day-to-day work, and I am now automating it. The idea is to build a robust automated ML pipeline that supports rapid development of machine-learning models. I will also open-source my data-analysis library, neo, in later iterations.
I will focus more on adding business value and less on getting highly accurate models. I will share my insight-generation process as part of it. No baseline model should take more than 30 minutes.
Below is my iteration plan.
- Iteration 1 : Basic Framework -done
- Iteration 2 : Make cross-validation more robust -done
- Iteration 3 : Add more ways to handle categorical variables -done
- Iteration 4 : Add Basic Visualization -done
- Iteration 5 : Add Iterations/Experiment Tracking
- Iteration 6 : Add feature selection
- Iteration 7 : Revisit the design and structure
- Iteration 8 : Add support for a custom data-analysis library
- Iteration 9 : Add pipeline support
- Iteration 10: Release v1
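
To give a flavour of the cross-validation work in Iteration 2, here is a minimal sketch of stratified fold assignment using only the Python standard library. It is not this project's actual code: the function name `assign_stratified_folds` and its parameters are hypothetical, chosen for illustration. The idea is to give every fold roughly the same class balance as the full dataset.

```python
import random
from collections import defaultdict


def assign_stratified_folds(labels, n_folds=5, seed=42):
    """Return one fold index per sample, keeping class ratios similar across folds.

    Hypothetical helper for illustration only; not part of this toolkit's API.
    """
    rng = random.Random(seed)

    # Group sample positions by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [0] * len(labels)
    for indices in by_class.values():
        # Shuffle within each class, then deal samples to folds round-robin.
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[idx] = position % n_folds
    return folds


# Imbalanced toy target: 80 negatives and 20 positives.
labels = [0] * 80 + [1] * 20
folds = assign_stratified_folds(labels, n_folds=5)
```

With this toy target, each of the five folds ends up with 16 negatives and 4 positives, so every validation split sees the same 80/20 balance as the whole dataset.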
Please feel free to send a PR.