
TFMR


This repository provides a modified implementation of the decoder component of the architecture described in the paper Attention Is All You Need [1]. In addition, example usage is provided by means of a language model trained on a small baking cookbook.

The architecture implemented in this repository is similar to that described in the paper Improving Language Understanding by Generative Pre-Training [2], in that it can be thought of as a standalone decoder component from [1] with the encoder-decoder attention sub-layer omitted. However, whereas the implementation in [2] makes some additional architectural changes, this implementation stays true to the original description in [1].

Figure 1: Left: the original transformer from [1]; middle: the implementation in this repository; right: OpenAI's transformer from [2].
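To make the structure concrete, here is a minimal sketch of one such decoder block in Keras: masked multi-head self-attention followed by a position-wise feed-forward network, each wrapped in a residual connection and post-layer-norm as in [1], with the encoder-decoder attention sub-layer omitted. This is not the repository's actual code; the dimensions are the defaults from [1], and the `use_causal_mask` argument assumes TensorFlow 2.10+.

```python
import tensorflow as tf

class DecoderBlock(tf.keras.layers.Layer):
    """Sketch of one decoder block from [1] without the encoder-decoder
    attention sub-layer: masked self-attention + feed-forward, each
    followed by a residual connection and layer normalisation."""

    def __init__(self, d_model=512, num_heads=8, d_ff=2048, rate=0.1):
        super().__init__()
        self.attn = tf.keras.layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=d_model // num_heads)
        self.ffn = tf.keras.Sequential([
            tf.keras.layers.Dense(d_ff, activation="relu"),
            tf.keras.layers.Dense(d_model),
        ])
        self.norm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.norm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.drop1 = tf.keras.layers.Dropout(rate)
        self.drop2 = tf.keras.layers.Dropout(rate)

    def call(self, x, training=False):
        # Causal mask stops each position attending to later positions.
        attn_out = self.attn(x, x, use_causal_mask=True)
        x = self.norm1(x + self.drop1(attn_out, training=training))
        ffn_out = self.ffn(x)
        return self.norm2(x + self.drop2(ffn_out, training=training))
```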

What's in the box?

- A Keras/Tensorflow implementation of the modified transformer decoder described above (Figure 1, middle).
- An example language model trained on a small baking cookbook (modules/language_model.py).

Quickstart

  1. git clone git@github.com:coxy1989/tfmr.git

  2. cd tfmr

  3. conda env create -f environment.yml

  4. source activate tfmr

  5. python modules/language_model.py
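Once training finishes, text can be sampled from the model. The sketch below shows one way to do that with greedy decoding from a character-level Keras model; `model`, `char_to_idx` and `idx_to_char` are hypothetical names introduced for illustration, not the repository's actual API.

```python
# Hypothetical usage sketch: greedy sampling from a trained character-level
# Keras language model such as the one built by modules/language_model.py.
# `model`, `char_to_idx` and `idx_to_char` are assumed names, not the
# repository's actual API.
import numpy as np

def generate(model, seed, char_to_idx, idx_to_char, length=200):
    indices = [char_to_idx[c] for c in seed]
    for _ in range(length):
        x = np.array(indices)[None, :]               # batch of one sequence
        probs = model.predict(x, verbose=0)[0, -1]   # next-token distribution
        indices.append(int(np.argmax(probs)))        # greedy pick
    return "".join(idx_to_char[i] for i in indices)

# e.g. generate(model, "Preheat the oven", char_to_idx, idx_to_char)
```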

References

[1] Vaswani et al. Attention Is All You Need. arXiv:1706.03762, 2017.

[2] Radford et al. Improving Language Understanding by Generative Pre-Training. OpenAI, 2018.
