NeuralMachineTranslation

Experiments with Seq2Seq and Transformer models for English-to-German translation.

This repository contains experiments with two machine translation models implemented in PyTorch. The first is a sequence-to-sequence model with multi-head attention; the second is a Transformer modeled after the TensorFlow implementation.

The goal of the experiments is to compare the two models in translation quality and in how easily they can be trained.
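Both architectures can be sketched with standard PyTorch building blocks. The module below is only an illustration of the seq2seq side (class name and hyperparameters are hypothetical, not the repository's actual code); the Transformer side can be built analogously from torch.nn.Transformer plus positional encodings.

```python
import torch
import torch.nn as nn

class Seq2SeqAttention(nn.Module):
    """GRU encoder/decoder with multi-head attention over encoder states (illustrative sketch)."""
    def __init__(self, src_vocab, tgt_vocab, d_model=256, n_heads=8):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d_model)
        self.tgt_emb = nn.Embedding(tgt_vocab, d_model)
        self.encoder = nn.GRU(d_model, d_model)
        self.decoder = nn.GRU(d_model, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads)
        self.out = nn.Linear(2 * d_model, tgt_vocab)

    def forward(self, src, tgt):
        # src, tgt: (seq_len, batch) tensors of token ids
        enc_out, hidden = self.encoder(self.src_emb(src))
        dec_out, _ = self.decoder(self.tgt_emb(tgt), hidden)
        ctx, _ = self.attn(dec_out, enc_out, enc_out)        # decoder states attend to encoder states
        return self.out(torch.cat([dec_out, ctx], dim=-1))   # logits over the target vocabulary
```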

The English-to-German dataset comes from http://www.manythings.org/ and has been cleaned and saved as a TSV file. The set contains 169,197 sentences.
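Reading a TSV of sentence pairs like this is straightforward; the file name and column layout below are assumptions and may not match the repository's file.

```python
import csv

def load_pairs(path="eng-deu.tsv"):
    """Load (English, German) sentence pairs from a tab-separated file."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            if len(row) >= 2:                   # skip malformed lines
                pairs.append((row[0], row[1]))
    return pairs

pairs = load_pairs()
print(len(pairs))  # 169,197 sentences for the full cleaned set
```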

Both models have been trained for 30 epochs. WER was used as the scoring function (blue: seq2seq, red: transformer):

[Figure: WER per training epoch for both models]
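WER here is the word-level Levenshtein distance between hypothesis and reference, normalized by the reference length. A minimal reference implementation (not the repository's scoring code):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by the reference length (in percent)."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return 100.0 * dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("das ist ein Test", "das ist Test"))  # 25.0
```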

Note that both models have been trained in a similar fashion (no warm-up for the Transformer) and use the same techniques for multi-head attention, label smoothing loss, token-to-word encoding, and scoring.
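Label smoothing replaces the one-hot training target with a softened distribution before the loss is computed. A minimal PyTorch sketch in the style of the Annotated Transformer (parameter names are illustrative, not the repository's):

```python
import torch
import torch.nn as nn

class LabelSmoothingLoss(nn.Module):
    """KL divergence between log-probabilities and a smoothed target distribution (sketch)."""
    def __init__(self, vocab_size, padding_idx=0, smoothing=0.1):
        super().__init__()
        self.criterion = nn.KLDivLoss(reduction="sum")
        self.vocab_size = vocab_size
        self.padding_idx = padding_idx
        self.confidence = 1.0 - smoothing
        self.smoothing = smoothing

    def forward(self, log_probs, target):
        # log_probs: (N, vocab_size) log-softmax outputs, target: (N,) gold token ids
        true_dist = torch.full_like(log_probs, self.smoothing / (self.vocab_size - 2))
        true_dist.scatter_(1, target.unsqueeze(1), self.confidence)  # put most mass on the gold token
        true_dist[:, self.padding_idx] = 0                           # never predict padding
        true_dist[target == self.padding_idx] = 0                    # ignore padded positions entirely
        return self.criterion(log_probs, true_dist)
```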

Seq2Seq with Multi-head Attention Model:

Greedy Decoding
CPU times: user 11.9 s, sys: 77.5 ms, total: 12 s
Test Summary: Bleu: 41.110, WER: 35.228, CER: 33.395, ACC: 17.702

Transformer Model:

Greedy Decoding
CPU times: user 43.2 s, sys: 217 ms, total: 43.4 s
Test Summary: Bleu: 43.570, WER: 35.392, CER: 33.750, ACC: 19.818
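Both sets of numbers were produced with greedy decoding, which feeds the most probable token at each step back into the decoder until an end-of-sentence token or a length limit is reached. A schematic version, assuming a model exposing encode/decode methods and <sos>/<eos> token ids (all names are hypothetical):

```python
import torch

@torch.no_grad()
def greedy_decode(model, src, sos_idx, eos_idx, max_len=50):
    """Translate one source sentence by repeatedly appending the most probable next token."""
    memory = model.encode(src)                        # encoder states for the source sentence
    ys = torch.tensor([[sos_idx]], dtype=torch.long)  # running target prefix, starts with <sos>
    for _ in range(max_len):
        logits = model.decode(ys, memory)             # (1, cur_len, vocab) logits
        next_token = logits[0, -1].argmax().item()    # most probable next token
        ys = torch.cat([ys, torch.tensor([[next_token]])], dim=1)
        if next_token == eos_idx:
            break
    return ys.squeeze(0).tolist()
```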
