This repository provides a modified implementation of the decoder component of the transformer architecture described in *Attention Is All You Need* [1]. In addition, example usage is provided by a language model trained on a small baking cookbook.
The architecture implemented in this repository is similar to the one described in *Improving Language Understanding by Generative Pre-Training* [2], in that it can be thought of as a standalone decoder component from [1] with the attention sub-layer that interfaces with an encoder removed. However, whereas the implementation in [2] makes some additional architectural changes, this implementation stays true to the original description in [1].
Figure 1. Left: the original transformer from [1]; middle: the implementation in this repository; right: OpenAI's transformer from [2].
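To make the omission concrete, the sketch below shows a single decoder block with the encoder-decoder attention sub-layer removed, keeping the masked self-attention, feed-forward, residual, and post-layer-norm structure of [1]. It is a minimal illustration only: the class name `DecoderBlock` and the hyperparameter names (`d_model`, `num_heads`, `d_ff`, defaults taken from [1]) are assumptions and may not match the identifiers used in this repository.

```python
import tensorflow as tf
from tensorflow.keras import layers

class DecoderBlock(layers.Layer):
    """Decoder block from [1] with the encoder-decoder attention omitted.

    Illustrative sketch only; names and defaults are assumptions, not
    this repository's actual identifiers.
    """

    def __init__(self, d_model=512, num_heads=8, d_ff=2048):
        super().__init__()
        self.self_attn = layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=d_model // num_heads)
        self.ffn = tf.keras.Sequential([
            layers.Dense(d_ff, activation="relu"),
            layers.Dense(d_model),
        ])
        self.norm1 = layers.LayerNormalization(epsilon=1e-6)
        self.norm2 = layers.LayerNormalization(epsilon=1e-6)

    def call(self, x):
        # Masked self-attention: the causal mask stops each position
        # from attending to later positions.
        attn_out = self.self_attn(x, x, use_causal_mask=True)
        x = self.norm1(x + attn_out)    # residual + post-layer-norm, as in [1]
        ffn_out = self.ffn(x)
        return self.norm2(x + ffn_out)  # residual + post-layer-norm, as in [1]

# Example: a batch of 16 token embeddings of width d_model passes
# through the block unchanged in shape.
x = tf.random.normal((1, 16, 512))
y = DecoderBlock()(x)  # shape (1, 16, 512)
```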
- Keras/TensorFlow implementation of the architecture in Figure 1.
- Language model trained on a small baking cookbook.
```bash
git clone git@github.com:coxy1989/tfmr.git
cd tfmr
conda env create -f environment.yml
source activate tfmr
python modules/language_model.py
```
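For context on what a decoder-only language model like this does at inference time, here is a hedged sketch of greedy autoregressive decoding, the simplest way such a model generates text one token at a time. The `model` callable, `prompt_ids`, and `eos_id` are hypothetical placeholders for illustration and are not the actual API of `modules/language_model.py`.

```python
import numpy as np

def greedy_decode(model, prompt_ids, max_new_tokens=50, eos_id=None):
    """Greedy autoregressive decoding sketch.

    `model` is a hypothetical callable mapping token-id sequences of
    shape (1, t) to logits of shape (1, t, vocab_size); it stands in
    for this repository's actual model object.
    """
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = model(np.array([ids]))          # (1, t, vocab_size)
        next_id = int(np.argmax(logits[0, -1]))  # most likely next token
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids
```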
[1] Vaswani et al. Attention Is All You Need. arXiv:1706.03762, 2017.
[2] Radford et al. Improving Language Understanding by Generative Pre-Training. OpenAI, 2018.