Machine Learning Algorithm Toolbox
- Different shallow and deep learning algorithms for text classification
data
|--- classif_data.csv/tsv/txt
src
|--- DataLoader.py
|--- Models.py
|--- Trainer.py
|--- Inference.py
|--- FeatureExtractor.py
main.py
Central to any ML system are three key things:
- data, on which the model will be trained;
- features, the representation of the data that serves as input to the model; and
- the algorithm (or the model itself), which is what gets trained.
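To make the difference between data and features concrete, here is a minimal sketch that assumes a bag-of-words representation. The repository may well use other features; the helper names `build_vocabulary` and `bag_of_words` and the sample sentences are made up for illustration.

```python
from collections import Counter


def build_vocabulary(texts):
    """Map each distinct lowercase token to a column index (sorted order)."""
    tokens = sorted({tok for text in texts for tok in text.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Turn one raw text into a fixed-length vector of token counts."""
    counts = Counter(text.lower().split())
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:  # tokens not in the vocabulary are ignored
            vector[vocabulary[tok]] = n
    return vector


# Data: raw texts with their labels (hypothetical examples).
texts = ["good movie", "bad movie", "good good plot"]
labels = ["pos", "neg", "pos"]

# Features: the numeric representation that a model would consume.
vocabulary = build_vocabulary(texts)
features = [bag_of_words(text, vocabulary) for text in texts]
```

Each row of `features` has one column per vocabulary entry, so every text ends up with the same length regardless of how many words it contains.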
A simple pipeline for any ML project can be defined as:
- Prepare your data: split it into train and test sets. We'll do this using DataLoader.py.
- Represent your data: extract features or embed your data; this can also be considered the pre-processing step. We'll do this using FeatureExtractor.py.
- Train the model. This is done in Trainer.py.
- Predict using the model. This is done using Inference.py.
main.py is a high-level wrapper that calls the different classes in a single place.
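The pipeline above could be wired together roughly as follows. This is only a sketch under stated assumptions: the class names mirror the file names in src, but their constructors and methods (`split`, `transform`, `fit`, `predict`) are hypothetical, and a small multinomial Naive Bayes classifier stands in for whichever models the toolbox actually provides.

```python
import math
import random
from collections import Counter, defaultdict


class DataLoader:
    """Holds (text, label) pairs and splits them into train and test sets."""

    def __init__(self, samples, test_ratio=0.25, seed=0):
        self.samples = list(samples)
        self.test_ratio = test_ratio
        self.seed = seed

    def split(self):
        shuffled = self.samples[:]
        random.Random(self.seed).shuffle(shuffled)  # reproducible shuffle
        n_test = max(1, int(len(shuffled) * self.test_ratio))
        return shuffled[n_test:], shuffled[:n_test]


class FeatureExtractor:
    """Represents a text as a bag of lowercase token counts."""

    def transform(self, text):
        return Counter(text.lower().split())


class Trainer:
    """Fits a multinomial Naive Bayes model (a stand-in for a real model)."""

    def fit(self, feature_rows, labels):
        class_counts = Counter(labels)
        token_counts = defaultdict(Counter)
        for row, label in zip(feature_rows, labels):
            token_counts[label].update(row)
        vocab = {tok for counts in token_counts.values() for tok in counts}
        return {
            "class_counts": class_counts,
            "token_counts": token_counts,
            "vocab_size": len(vocab),
        }


class Inference:
    """Scores each class by log-probability and returns the best label."""

    def __init__(self, model):
        self.model = model

    def predict(self, row):
        total = sum(self.model["class_counts"].values())
        best_label, best_score = None, -math.inf
        for label, n in self.model["class_counts"].items():
            counts = self.model["token_counts"][label]
            # Add-one smoothing so unseen tokens do not zero out a class.
            denom = sum(counts.values()) + self.model["vocab_size"]
            score = math.log(n / total)
            for tok, k in row.items():
                score += k * math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def main(samples):
    """Hypothetical main.py-style wrapper: split, featurize, train, predict."""
    train, test = DataLoader(samples).split()
    extractor = FeatureExtractor()
    model = Trainer().fit(
        [extractor.transform(text) for text, _ in train],
        [label for _, label in train],
    )
    inference = Inference(model)
    return [(text, inference.predict(extractor.transform(text))) for text, _ in test]
```

Calling `main` with a list of `(text, label)` pairs returns the held-out texts paired with their predicted labels. Keeping each step in its own class means a different feature extractor or model can be swapped in without touching the wrapper.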