A spam filtering application that uses classification algorithms to separate spam messages from legitimate ones.
-
src folder : contains the source code of the project. It consists of three packages: naive_bayes, logistic_regression and feature_selector. The first package contains the multinomial naive Bayes implementation, and the second contains the logistic regression implementation. The third package contains the code responsible for extracting the features that characterize spam and non-spam messages.
-
dataset folder : contains the data used for training and testing. The data were taken from AUEB's NLP Group.
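
To illustrate the kind of classifier the naive_bayes package is described as implementing, here is a minimal sketch of multinomial naive Bayes with add-one (Laplace) smoothing. It is not taken from the src folder: the choice of Python, the class name `MultinomialNaiveBayes`, its method names, and the four toy messages are all assumptions made for this example, and the real implementation and dataset format may differ.

```python
import math
from collections import Counter, defaultdict


class MultinomialNaiveBayes:
    """Hypothetical toy classifier; not the project's actual class."""

    def fit(self, documents, labels):
        # documents: list of token lists, labels: list of class names
        self.classes = sorted(set(labels))
        self.vocabulary = {tok for doc in documents for tok in doc}
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for doc, label in zip(documents, labels):
            self.token_counts[label].update(doc)
        self.total_docs = len(labels)
        return self

    def log_posterior(self, doc, label):
        # log P(label) + sum over tokens of log P(token | label)
        log_prob = math.log(self.class_counts[label] / self.total_docs)
        total_tokens = sum(self.token_counts[label].values())
        denominator = total_tokens + len(self.vocabulary)
        for tok in doc:
            if tok not in self.vocabulary:
                continue  # ignore tokens never seen during training
            # add-one smoothing keeps unseen (token, label) pairs above zero
            log_prob += math.log((self.token_counts[label][tok] + 1) / denominator)
        return log_prob

    def predict(self, doc):
        return max(self.classes, key=lambda c: self.log_posterior(doc, c))


# Made-up toy messages, already tokenized; not from the dataset folder.
train_docs = [
    ["win", "money", "now"],
    ["cheap", "money", "offer"],
    ["meeting", "at", "noon"],
    ["project", "meeting", "notes"],
]
train_labels = ["spam", "spam", "ham", "ham"]

model = MultinomialNaiveBayes().fit(train_docs, train_labels)
print(model.predict(["money", "offer"]))
print(model.predict(["meeting", "notes"]))
```

Working in log space avoids numeric underflow when many token probabilities are multiplied, which is why the sketch sums logarithms instead of multiplying raw probabilities.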