A neural network that separates valid (legit) messages from incoming spam. Might work on deployment soon...
Stuff used:
- SpamSMS dataset from Kaggle
- Data cleaning and preprocessing with 're' and 'NLTK'
- Lemmatization, since I think it should work better than Porter stemming.
- Bag of Words to convert the text to vectors, as TF-IDF made all the values extremely small, leading to the majority of words being labelled 0 when they should be 1.
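
To illustrate the Bag of Words versus TF-IDF point, here is a minimal sketch using only the standard library. It is not the project's actual code: the sample messages, the `clean`, `bag_of_words` and `tf_idf` helpers, and the textbook TF-IDF formula are all assumptions made for the example, and the lemmatization step is left out so the snippet runs without NLTK data. It shows one plausible way the zeros could arise: every TF-IDF weight is a fraction below 1, so truncating it to an integer gives 0, while Bag of Words keeps a count of 1 for each word that is present.

```python
import math
import re
from collections import Counter

# Toy messages, invented for illustration (not taken from the dataset).
messages = [
    "WIN a FREE prize now!!!",
    "Are we still meeting for lunch today?",
    "Free entry: text WIN to claim your prize",
]


def clean(text):
    """Keep letters only, lowercase, and split on whitespace."""
    return re.sub(r"[^a-zA-Z]", " ", text).lower().split()


docs = [clean(m) for m in messages]
vocab = sorted({word for doc in docs for word in doc})


def bag_of_words(doc):
    """Raw term counts, one entry per vocabulary word."""
    counts = Counter(doc)
    return [counts[word] for word in vocab]


def tf_idf(doc):
    """Textbook TF-IDF: (count / doc length) * log(N / document frequency)."""
    counts = Counter(doc)
    n_docs = len(docs)
    vector = []
    for word in vocab:
        # Number of documents that contain this word at least once.
        df = sum(1 for d in docs if word in d)
        vector.append((counts[word] / len(doc)) * math.log(n_docs / df))
    return vector


bow = bag_of_words(docs[0])
tfidf = tf_idf(docs[0])

print(dict(zip(vocab, bow)))
print({w: round(v, 3) for w, v in zip(vocab, tfidf)})
```

Library vectorizers usually apply smoothing and normalization on top of this formula, so the exact numbers will differ, but the weights stay fractional in the same way.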