NeuralMachineTranslation

Experiments with Seq2Seq and Transformer models for English-to-German translation.

This repository contains experiments with two machine translation models implemented in PyTorch. The first is a sequence-to-sequence model with multi-head attention; the second is a Transformer modeled after the TensorFlow implementation.

The goal of the experiments is to compare the two models in translation quality and in how easily they can be trained.
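Both architectures can be sketched with standard PyTorch building blocks. The module below is only an illustration of the seq2seq side (class name and hyperparameters are hypothetical, not the repository's actual code); the Transformer side can be built analogously from torch.nn.Transformer plus positional encodings.

```python
import torch
import torch.nn as nn

class Seq2SeqAttention(nn.Module):
    """GRU encoder/decoder with multi-head attention over encoder states (illustrative sketch)."""
    def __init__(self, src_vocab, tgt_vocab, d_model=256, n_heads=8):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d_model)
        self.tgt_emb = nn.Embedding(tgt_vocab, d_model)
        self.encoder = nn.GRU(d_model, d_model)
        self.decoder = nn.GRU(d_model, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads)
        self.out = nn.Linear(2 * d_model, tgt_vocab)

    def forward(self, src, tgt):
        # src, tgt: (seq_len, batch) tensors of token ids
        enc_out, hidden = self.encoder(self.src_emb(src))
        dec_out, _ = self.decoder(self.tgt_emb(tgt), hidden)
        ctx, _ = self.attn(dec_out, enc_out, enc_out)        # decoder states attend to encoder states
        return self.out(torch.cat([dec_out, ctx], dim=-1))   # logits over the target vocabulary
```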

The English-to-German dataset comes from http://www.manythings.org/ and has been cleaned and saved as a TSV file. The set contains 169,197 sentences.
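Reading a TSV of sentence pairs like this is straightforward; the file name and column layout below are assumptions and may not match the repository's file.

```python
import csv

def load_pairs(path="eng-deu.tsv"):
    """Load (English, German) sentence pairs from a tab-separated file."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            if len(row) >= 2:                   # skip malformed lines
                pairs.append((row[0], row[1]))
    return pairs

pairs = load_pairs()
print(len(pairs))  # 169,197 sentences for the full cleaned set
```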

Both models have been trained for 30 epochs. WER was used as the scoring function (blue: seq2seq, red: transformer):

[Figure: WER per training epoch for both models]
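WER here is the word-level Levenshtein distance between hypothesis and reference, normalized by the reference length. A minimal reference implementation (not the repository's scoring code):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by the reference length (in percent)."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return 100.0 * dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("das ist ein Test", "das ist Test"))  # 25.0
```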

Note that both models have been trained in a similar fashion (no warm-up for the Transformer) and use the same techniques for multi-head attention, label smoothing loss, token-to-word encoding, and scoring.
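Label smoothing replaces the one-hot training target with a softened distribution before the loss is computed. A minimal PyTorch sketch in the style of the Annotated Transformer (parameter names are illustrative, not the repository's):

```python
import torch
import torch.nn as nn

class LabelSmoothingLoss(nn.Module):
    """KL divergence between log-probabilities and a smoothed target distribution (sketch)."""
    def __init__(self, vocab_size, padding_idx=0, smoothing=0.1):
        super().__init__()
        self.criterion = nn.KLDivLoss(reduction="sum")
        self.vocab_size = vocab_size
        self.padding_idx = padding_idx
        self.confidence = 1.0 - smoothing
        self.smoothing = smoothing

    def forward(self, log_probs, target):
        # log_probs: (N, vocab_size) log-softmax outputs, target: (N,) gold token ids
        true_dist = torch.full_like(log_probs, self.smoothing / (self.vocab_size - 2))
        true_dist.scatter_(1, target.unsqueeze(1), self.confidence)  # put most mass on the gold token
        true_dist[:, self.padding_idx] = 0                           # never predict padding
        true_dist[target == self.padding_idx] = 0                    # ignore padded positions entirely
        return self.criterion(log_probs, true_dist)
```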

Seq2Seq with Multi-head Attention Model:

Greedy Decoding
CPU times: user 11.9 s, sys: 77.5 ms, total: 12 s
Test Summary: Bleu: 41.110, WER: 35.228, CER: 33.395, ACC: 17.702

Transformer Model:

Greedy Decoding
CPU times: user 43.2 s, sys: 217 ms, total: 43.4 s
Test Summary: Bleu: 43.570, WER: 35.392, CER: 33.750, ACC: 19.818
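Both sets of numbers were produced with greedy decoding, which feeds the most probable token at each step back into the decoder until an end-of-sentence token or a length limit is reached. A schematic version, assuming a model exposing encode/decode methods and <sos>/<eos> token ids (all names are hypothetical):

```python
import torch

@torch.no_grad()
def greedy_decode(model, src, sos_idx, eos_idx, max_len=50):
    """Translate one source sentence by repeatedly appending the most probable next token."""
    memory = model.encode(src)                        # encoder states for the source sentence
    ys = torch.tensor([[sos_idx]], dtype=torch.long)  # running target prefix, starts with <sos>
    for _ in range(max_len):
        logits = model.decode(ys, memory)             # (1, cur_len, vocab) logits
        next_token = logits[0, -1].argmax().item()    # most probable next token
        ys = torch.cat([ys, torch.tensor([[next_token]])], dim=1)
        if next_token == eos_idx:
            break
    return ys.squeeze(0).tolist()
```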
