Unsupervised Language Learning

This repository contains code for the Unsupervised Language Learning course offered at the University of Amsterdam.

Contributors

Lab 1 - Evaluating Word Representations

Problem Statement: The goal of this practical is for you to familiarise yourselves with word representation models and with different techniques for evaluating them. The word representation model that you will work with is skip-gram, trained using two kinds of context: dependency-based and word-window-based. The dependency-based model learns the word representations from dependency-annotated contexts, as described in Omer Levy and Yoav Goldberg, "Dependency-Based Word Embeddings", published at ACL 2014.
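To make the two kinds of context concrete, here is a minimal sketch of how (word, context) pairs could be extracted in each setting. The function names, the arc format `(head_index, label, modifier_index)` and the hand-written parse are assumptions made for illustration, not the lab's code. The dependency contexts follow the labelling scheme from Levy and Goldberg (modifier/label for a head, head/label-1 for a modifier); the preposition collapsing described in the paper is left out.

```python
def window_contexts(tokens, window=2):
    """Pair every word with the words at most `window` positions away."""
    pairs = []
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((word, tokens[j]))
    return pairs


def dependency_contexts(tokens, arcs):
    """Pair every word with its syntactic neighbours.

    `arcs` is a list of (head_index, label, modifier_index) triples,
    a hypothetical format chosen for this sketch.
    """
    pairs = []
    for head, label, mod in arcs:
        # The head sees its modifier together with the relation label.
        pairs.append((tokens[head], f"{tokens[mod]}/{label}"))
        # The modifier sees its head with the inverse relation.
        pairs.append((tokens[mod], f"{tokens[head]}/{label}-1"))
    return pairs


# Hand-written parse of a short sentence (assumed, not parser output).
tokens = ["australian", "scientist", "discovers", "star"]
arcs = [(1, "amod", 0), (2, "nsubj", 1), (2, "dobj", 3)]

window_pairs = window_contexts(tokens, window=1)
dependency_pairs = dependency_contexts(tokens, arcs)
```

With a window of 1, "discovers" is paired with "scientist" and "star" purely by position. The dependency contexts pair it with the same two words here, but tagged with their syntactic roles, and they would still do so if other words stood in between.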

Report
Code

Lab 2 - Learning Word Representations

Problem Statement: You will implement three models of word representation: one trained by maximum likelihood, and two latent variable models trained by variational inference. The models are skip-gram, Bayesian skip-gram, and Embed-Align. Skip-gram is trained discriminatively by having a central word predict the context words in a window surrounding it. Bayesian skip-gram introduces stochastic latent embeddings, but does not change the discriminative nature of the training procedure. Embed-Align introduces stochastic latent embeddings as well as a latent alignment variable, and learns by generating translation data. Finally, you should compare the performance of these three models on the lexical substitution task.
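As an illustration of the skip-gram objective described above, the following sketch computes the average negative log-likelihood of context words given their central words under a full softmax. It uses only the standard library. The function names, the toy vocabulary and the plain-list embedding tables are assumptions for this example; the lab code may use a different parameterisation or an approximation such as negative sampling.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def skipgram_nll(pairs, vocab, in_emb, out_emb):
    """Average negative log-likelihood of (center, context) pairs.

    in_emb[k]  : embedding of vocab[k] when it is the central word
    out_emb[k] : embedding of vocab[k] when it is a context word
    """
    index = {word: k for k, word in enumerate(vocab)}
    total = 0.0
    for center, context in pairs:
        v = in_emb[index[center]]
        # Score every vocabulary word as a candidate context word.
        scores = [sum(a * b for a, b in zip(v, u)) for u in out_emb]
        total -= log_softmax(scores)[index[context]]
    return total / len(pairs)


# Toy data: (center, context) pairs from "the cat sat" with a window of 1.
vocab = ["the", "cat", "sat"]
pairs = [("the", "cat"), ("cat", "the"), ("cat", "sat"), ("sat", "cat")]

# With all-zero embeddings every context word is equally likely,
# so the loss equals log(len(vocab)).
zeros = [[0.0, 0.0] for _ in vocab]
uniform_loss = skipgram_nll(pairs, vocab, zeros, zeros)
```

Training would lower this loss by moving the central-word embedding towards the context-word embeddings of words that actually occur in its window.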

Report
Code

Lab 3 - Evaluating Sentence Representations

Problem Statement: In the second practical, you implemented and trained three different models to learn word embeddings: skip-gram, Bayesian skip-gram, and Embed-Align. You evaluated the performance of these three models on the lexical substitution task. In this practical, your task is to compare these models using SentEval. SentEval, Facebook's evaluation toolkit for sentence embeddings, is a library for evaluating the quality of sentence embeddings by applying them to a broad and diverse set of downstream tasks called "transfer" tasks. They are called transfer tasks because the sentence embeddings are not explicitly optimized for them.
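Since the three models produce word embeddings while SentEval scores sentence embeddings, some composition step is needed. One common baseline, assumed here and not necessarily the one used in this lab, is to average the word vectors of a sentence. The sketch below shapes that step like the `batcher(params, batch)` callable that SentEval expects, where `batch` is a list of tokenised sentences. The keys `word_vectors` and `dim` are hypothetical, `params` is treated as a plain dict, and the result is a list of lists that would need to be converted to a NumPy array before being returned to SentEval.

```python
def average_embedding(tokens, word_vectors, dim):
    """Mean of the known word vectors; a zero vector if none are known."""
    vectors = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def batcher(params, batch):
    """Return one sentence embedding per tokenised sentence in `batch`."""
    return [
        average_embedding(sentence, params["word_vectors"], params["dim"])
        for sentence in batch
    ]


# Toy 2-dimensional embeddings standing in for a trained model.
params = {
    "word_vectors": {"good": [1.0, 0.0], "movie": [0.0, 1.0]},
    "dim": 2,
}
embeddings = batcher(params, [["good", "movie"], ["unseen", "words"]])
```

Out-of-vocabulary words are skipped, so a sentence with no known words falls back to the zero vector instead of raising an error.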

Report
Code

Repository: druv022/Unsupervised-Language-Learning
