Basic steps to train a text classifier

Step1, preprocess the article titles (tokenization, lemmatization, etc.)

Step2, train several candidate classifiers (MultinomialNB, LinearSVC, RandomForestClassifier, etc.)

Step3, compare the classifiers by score/accuracy and keep the best one

Step4, tune the hyperparameters of the chosen classifier, for example with 10-fold cross-validation. Use a validation dataset to tune the hyperparameters, and keep the test data held out.

Step5, predict the labels of new data samples
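
The five steps above could be sketched with scikit-learn roughly as follows. This is a minimal sketch under assumptions: the notes do not name a library (the classifier abbreviations only suggest scikit-learn), and the toy titles, labels, parameter grids and the helper make_pipeline_for are hypothetical. It uses 4-fold cross-validation only because the toy data is tiny; with a real dataset, cv=10 would give the 10-fold scheme from Step4. Lemmatization is left out because scikit-learn has no built-in lemmatizer; a library such as NLTK or spaCy would be needed for that part of Step1.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data: article titles with made-up topic labels.
sport = [
    "Striker scores twice in league final",
    "Coach praises team after football win",
    "Tennis champion wins third title",
    "Football club signs young striker",
    "Team loses final after late goal",
    "Runner breaks marathon record",
    "League match ends in goalless draw",
    "Champion boxer defends title",
    "Coach names squad for cup match",
    "Late goal wins cup for home team",
]
tech = [
    "New phone ships with faster chip",
    "Software update fixes security bug",
    "Startup releases open source database",
    "Laptop maker unveils new processor",
    "Cloud provider cuts storage prices",
    "Browser update improves security",
    "Chip maker announces faster processor",
    "Developers release new software library",
    "Phone maker patches software bug",
    "Database startup raises cloud funding",
]
titles = sport + tech
labels = ["sport"] * len(sport) + ["tech"] * len(tech)

# Hold out a test set that is never used for model selection or tuning.
X_train, X_test, y_train, y_test = train_test_split(
    titles, labels, test_size=0.2, stratify=labels, random_state=0
)


def make_pipeline_for(clf):
    """Step1: tokenize, lowercase and drop stop words, then apply the classifier."""
    return Pipeline([
        ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
        ("clf", clf),
    ])


# Step2: candidate classifiers.
candidates = {
    "MultinomialNB": MultinomialNB(),
    "LinearSVC": LinearSVC(),
    "RandomForestClassifier": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Step3: compare mean cross-validated accuracy on the training data.
scores = {
    name: cross_val_score(
        make_pipeline_for(clf), X_train, y_train, cv=4, scoring="accuracy"
    ).mean()
    for name, clf in candidates.items()
}
best_name = max(scores, key=scores.get)

# Step4: tune the best candidate with a grid search (illustrative grids only).
param_grids = {
    "MultinomialNB": {"clf__alpha": [0.1, 0.5, 1.0]},
    "LinearSVC": {"clf__C": [0.1, 1.0, 10.0]},
    "RandomForestClassifier": {"clf__n_estimators": [50, 100]},
}
search = GridSearchCV(
    make_pipeline_for(candidates[best_name]),
    param_grids[best_name],
    cv=4,
    scoring="accuracy",
)
search.fit(X_train, y_train)
test_accuracy = search.score(X_test, y_test)

# Step5: predict labels for new, unseen titles.
new_titles = [
    "Goalkeeper saves penalty in cup final",
    "New laptop ships with faster processor",
]
predictions = search.predict(new_titles)

print("CV accuracy per classifier:", scores)
print("Best classifier:", best_name, "with params", search.best_params_)
print("Held-out test accuracy:", test_accuracy)
print("Predictions:", list(zip(new_titles, predictions)))
```

Putting the vectorizer inside the Pipeline means it is refit on each training fold, so no vocabulary or IDF statistics leak from the validation folds into training.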
