This repository contains a notebook with simple prototypes that add an attention mechanism to an LSTM for sequence labeling. It is part of a presentation for the TensorFlow Meetup Buenos Aires, June 2018.
We also include scripts to visualize the attention weights obtained after training on the Named Entity Recognition task, using a portion of the CoNLL 2003 dataset.
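As a rough illustration of the idea (not the notebook's exact implementation), the following NumPy sketch computes dot-product attention over a sequence of LSTM hidden states; the scoring vector `w` stands in for a hypothetical learned parameter:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_over_states(h, w):
    """Attention over LSTM hidden states.

    h: (timesteps, hidden) matrix of LSTM outputs.
    w: (hidden,) scoring vector (hypothetical learned parameter).
    Returns the attention weights and the resulting context vector.
    """
    scores = h @ w            # one relevance score per timestep
    alpha = softmax(scores)   # weights sum to 1 across timesteps
    context = alpha @ h       # weighted sum of the hidden states
    return alpha, context

# Toy example: 5 timesteps, hidden size 8
rng = np.random.RandomState(0)
h = rng.randn(5, 8)
w = rng.randn(8)
alpha, context = attention_over_states(h, w)
```

The weights `alpha` are what the visualization scripts plot: they show which timesteps the model focused on when producing a label.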
The original dataset was uploaded to Kaggle, along with a vanilla LSTM implementation. We have also hosted it on the UNC servers:
- Full dataset (150M)
- Sample dataset (14M)
There are also some trained models you can download, as training on the full dataset takes a while even with a GPU:
- Without attention
- With attention - Model 1
- With attention - Model 1 - Linear
- With attention - Model 2
- With attention - Model 2 - Linear
To run the network, we recommend using Python 3.5 and installing:
- Keras 2.1.5
- scikit-learn 0.19.1
- pandas 0.23.0
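For convenience, the pinned versions above can be installed in one step (assuming `pip` points at a Python 3.5 environment):

```shell
pip install Keras==2.1.5 scikit-learn==0.19.1 pandas==0.23.0
```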