
Basic Steps to train a text classifier

hangyang edited this page Apr 8, 2019 · 1 revision

Step1, preprocess the article titles: tokenization, lemmatization, etc.
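A minimal sketch of this preprocessing step, using only the standard library. The stop-word list and the `crude_lemma` helper are hypothetical stand-ins for illustration; a real lemmatizer (for example from NLTK or spaCy) would replace the crude suffix rule.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration.
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "for", "and", "on"}


def crude_lemma(token):
    """Rough stand-in for lemmatization: strip a trailing plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def preprocess_title(title):
    """Lowercase, tokenize on alphanumeric runs, drop stop words, normalize."""
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    return [crude_lemma(t) for t in tokens if t not in STOP_WORDS]


print(preprocess_title("The Rise of Neural Networks"))
# ['rise', 'neural', 'network']
```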

Step2, train several candidate classifiers (MultinomialNB, LinearSVC, RandomForestClassifier, etc.)

Step3, compare the classifiers by score/accuracy and keep the best one
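Steps 2 and 3 could look roughly like the sketch below, assuming scikit-learn is the library in use. The titles and labels are made-up toy data, and TF-IDF features with 3-fold cross-validation are choices made for this example, not something the page prescribes.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data: article titles with made-up topic labels.
titles = [
    "Local team wins football championship",
    "Tennis star reaches the final",
    "Marathon runner sets new record",
    "Basketball coach announces retirement",
    "Football league postpones the final match",
    "Olympic swimmer wins gold medal",
    "New smartphone released with faster chip",
    "Software update fixes security bug",
    "Laptop maker announces new processor",
    "Cloud service suffers major outage",
    "Startup releases open source database software",
    "Chip maker unveils faster graphics processor",
]
labels = ["sports"] * 6 + ["tech"] * 6

candidates = {
    "MultinomialNB": MultinomialNB(),
    "LinearSVC": LinearSVC(),
    "RandomForestClassifier": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Score every candidate with the same features and the same folds.
scores = {}
for name, clf in candidates.items():
    model = make_pipeline(TfidfVectorizer(), clf)
    fold_scores = cross_val_score(model, titles, labels, cv=3, scoring="accuracy")
    scores[name] = fold_scores.mean()
    print(f"{name}: mean accuracy {scores[name]:.2f}")

best_name = max(scores, key=scores.get)
print("best classifier:", best_name)
```

Wrapping the vectorizer and the classifier in one pipeline keeps the TF-IDF vocabulary from being fitted on the held-out fold.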

Step4, tune the hyperparameters of the chosen classifier, for example with 10-fold cross-validation. Use the validation data, not the test data, to select the hyperparameter values.
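One way to run this tuning is a grid search with 10-fold cross-validation, sketched here with scikit-learn's `GridSearchCV`. The toy titles, the choice of LinearSVC, and the grid values are all assumptions for illustration; each fold's held-out part acts as the validation data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data: 10 titles per class so that 10 folds are possible.
sports_words = ["football", "tennis", "golf", "rugby", "hockey",
                "cricket", "boxing", "cycling", "rowing", "skiing"]
tech_words = ["laptop", "phone", "tablet", "router", "camera",
              "printer", "monitor", "keyboard", "speaker", "drone"]
titles = ([f"{w} match report" for w in sports_words]
          + [f"{w} product review" for w in tech_words])
labels = ["sports"] * 10 + ["tech"] * 10

pipeline = make_pipeline(TfidfVectorizer(), LinearSVC())

# make_pipeline names each step after its lowercased class name.
param_grid = {
    "tfidfvectorizer__ngram_range": [(1, 1), (1, 2)],
    "linearsvc__C": [0.1, 1.0, 10.0],
}

# cv=10 gives stratified 10-fold cross-validation for a classifier.
search = GridSearchCV(pipeline, param_grid, cv=10, scoring="accuracy")
search.fit(titles, labels)

print("best parameters:", search.best_params_)
print(f"best mean validation accuracy: {search.best_score_:.2f}")
```

After the search, `search.best_estimator_` is the pipeline refitted on all the training data with the selected values.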

Step5, predict labels for new data samples
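Prediction on unseen titles can then be sketched as follows, again assuming scikit-learn. The training titles, labels, and new titles are invented for the example; in practice the fitted model would be the tuned pipeline from the previous step.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical training data.
train_titles = [
    "Local team wins football championship",
    "Tennis star reaches the final",
    "Olympic swimmer wins gold medal",
    "New smartphone released with faster chip",
    "Software update fixes security bug",
    "Laptop maker announces new processor",
]
train_labels = ["sports", "sports", "sports", "tech", "tech", "tech"]

model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train_titles, train_labels)

# New samples go through the same pipeline, so they are vectorized
# with the vocabulary learned during training.
new_titles = ["Tennis final ends in five sets", "Laptop chip review"]
predictions = model.predict(new_titles)

for title, label in zip(new_titles, predictions):
    print(f"{title!r} -> {label}")
```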
