Experimenting with word2vec
The goal of this code is to explore word embeddings learned via Word2Vec with Gensim (in Python 3).
This is an introductory, exploratory Jupyter notebook for getting to know Word2Vec and some text-analytics concepts and tools. For a more rigorous tutorial and application of Word2Vec, please see my repo on how words are loaded with meaning.
This notebook covers the following steps:
1. Upload and Clean/Preprocess Data to Feed into Word2Vec
2. Learn Word2Vec Word Embeddings
3. Explore Word2Vec Word Embeddings (3A. Word Similarities & Robustness; 3B. Visualization with t-SNE & Robustness)
4. Word Embeddings as Features to Predict Author Gender
5. The Issue of Polysemy (words with more than one meaning, or "sense", such as a river "bank" vs. a financial "bank")
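To make the first step concrete, here is a minimal sketch of the kind of preprocessing that turns raw text into the list-of-token-lists format that Gensim's Word2Vec accepts as training input. The function name `preprocess`, the regular expressions, and the example sentence are assumptions made for illustration; they are not the cleaning steps used later in the notebook.

```python
import re

def preprocess(text):
    """Split raw text into sentences, then each sentence into lowercase word tokens.

    Hypothetical helper for illustration only.
    """
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    tokenized = []
    for sentence in sentences:
        # Keep runs of letters (allowing internal apostrophes); drop digits and punctuation.
        tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", sentence.lower())
        if tokens:
            tokenized.append(tokens)
    return tokenized

# Each inner list is one sentence; note that both senses of "bank" map to the same token.
corpus = preprocess("She sat on the river bank. The bank approved her loan!")
print(corpus)
```

A list like `corpus` can be passed as the `sentences` argument of `gensim.models.Word2Vec`. Because both senses of "bank" collapse into a single token here, the example also hints at why polysemy becomes an issue once each token gets exactly one vector.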