Different Types of Word Embeddings
- Frequency-based Embedding - Count Vector, TF-IDF Vector (see the scikit-learn sketch below)
- Prediction-based Embedding - CBOW (Continuous Bag of Words), Skip-gram model
Word Embedding Algorithms - word2vec, GloVe
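
A minimal sketch of the two frequency-based vectorizers using scikit-learn (assumes scikit-learn >= 1.0 for `get_feature_names_out`; the toy corpus is made up for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

# Count Vector: raw term frequencies, one row per document
count_vec = CountVectorizer()
counts = count_vec.fit_transform(corpus)      # sparse matrix (n_docs, n_terms)
print(count_vec.get_feature_names_out())
print(counts.toarray())

# TF-IDF Vector: term frequencies reweighted by inverse document frequency,
# so terms that appear in every document get downweighted
tfidf_vec = TfidfVectorizer()
tfidf = tfidf_vec.fit_transform(corpus)
print(tfidf.toarray().round(2))
```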
Skip-gram works well with small amounts of training data and represents even rare words or phrases well. CBOW is several times faster to train than skip-gram and gives slightly better accuracy for frequent words.
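
Both variants are exposed through gensim's `Word2Vec` class via the `sg` flag; a minimal sketch, assuming gensim >= 4.0 and a toy corpus:

```python
from gensim.models import Word2Vec

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "log"],
]

# sg=0 -> CBOW: faster to train, slightly better for frequent words
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)

# sg=1 -> skip-gram: better for small corpora and rare words
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

print(cbow.wv["cat"].shape)                  # (50,)
print(skipgram.wv.most_similar("cat", topn=2))
```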
CBOW (Continuous Bag of Words) - backprop walkthrough: http://www.claudiobellei.com/2018/01/07/backprop-word2vec-python/
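
In the spirit of the linked walkthrough, a minimal numpy sketch of one CBOW training step, forward pass plus backprop (the vocabulary size, embedding dimension, word indices, and learning rate are illustrative assumptions, not values from the tutorial):

```python
import numpy as np

rng = np.random.default_rng(0)
V, N = 5, 3                                   # vocab size, embedding dim
W1 = rng.normal(scale=0.1, size=(V, N))       # input->hidden (the embeddings)
W2 = rng.normal(scale=0.1, size=(N, V))       # hidden->output
context_ids = [0, 2]                          # context word indices
target_id = 1                                 # center word to predict
lr = 0.05

# Forward: hidden layer is the average of the context word vectors
h = W1[context_ids].mean(axis=0)              # (N,)
u = h @ W2                                    # (V,) scores
y = np.exp(u - u.max()); y /= y.sum()         # softmax
loss = -np.log(y[target_id])

# Backprop: softmax cross-entropy error, then chain rule
e = y.copy(); e[target_id] -= 1.0             # dL/du
dW2 = np.outer(h, e)                          # (N, V)
dh = W2 @ e                                   # computed before W2 is updated
W2 -= lr * dW2
for i in context_ids:                         # the averaged context vectors
    W1[i] -= lr * dh / len(context_ids)       # share the hidden gradient

print(f"loss = {loss:.4f}")
```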
Skip-gram model - backprop walkthrough: http://www.claudiobellei.com/2018/01/07/backprop-word2vec-python/
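
The mirror-image sketch for skip-gram, where the center word predicts each context word in turn (same toy setup and assumptions as the CBOW sketch above):

```python
import numpy as np

rng = np.random.default_rng(0)
V, N = 5, 3
W1 = rng.normal(scale=0.1, size=(V, N))
W2 = rng.normal(scale=0.1, size=(N, V))
center_id = 1                                 # input word
context_ids = [0, 2]                          # words to predict
lr = 0.05

# Forward: hidden layer is just the center word's vector
h = W1[center_id]                             # (N,)
u = h @ W2
y = np.exp(u - u.max()); y /= y.sum()
loss = -sum(np.log(y[c]) for c in context_ids)

# Backprop: softmax errors summed over all context positions
e = len(context_ids) * y
for c in context_ids:
    e[c] -= 1.0
dW2 = np.outer(h, e)
dh = W2 @ e                                   # computed before W2 is updated
W2 -= lr * dW2
W1[center_id] -= lr * dh                      # only the center row changes

print(f"loss = {loss:.4f}")
```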
Example notebook - Toxic Comments LSTM with GloVe embeddings: https://github.com/vg11072001/NLP-with-Python/blob/master/Toxic%20Comments%20LSTM%20GloVe.ipynb
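
The notebook feeds pretrained GloVe vectors into an LSTM; a minimal sketch of the usual wiring step, building a frozen Keras `Embedding` layer from a GloVe file (the path `glove.6B.100d.txt` and the toy `word_index` are assumptions; assumes tf.keras 2.x, where the `weights=` argument is supported):

```python
import numpy as np
from tensorflow.keras.layers import Embedding

embedding_dim = 100

# Parse the GloVe text file: one word followed by its vector per line
embeddings_index = {}
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        values = line.split()
        embeddings_index[values[0]] = np.asarray(values[1:], dtype="float32")

# word_index maps each word to an integer id (normally from a fitted
# keras Tokenizer); shown here as a toy dict to keep the sketch self-contained
word_index = {"toxic": 1, "comment": 2}

embedding_matrix = np.zeros((len(word_index) + 1, embedding_dim))
for word, i in word_index.items():
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector          # out-of-vocab rows stay zero

embedding_layer = Embedding(
    input_dim=embedding_matrix.shape[0],
    output_dim=embedding_dim,
    weights=[embedding_matrix],
    trainable=False,                          # keep the GloVe vectors frozen
)
```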
Ref -
- https://github.com/codebasics/nlp-tutorials
- https://github.com/siddiquiamir/Python-Data-Preprocessing
- https://medium.com/@diegoglozano/building-a-pipeline-for-nlp-b569d51db2d1
- https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
- https://towardsdatascience.com/deep-learning-pipeline-for-natural-language-processing-nlp-c6f4074897bb
- https://www.kdnuggets.com/2018/04/implementing-deep-learning-methods-feature-engineering-text-data-skip-gram.html
- https://www.analyticsvidhya.com/blog/2021/06/practical-guide-to-word-embedding-system/