Language Modeling with Gated Convolutional Networks (Dauphin et al., 2017)

Example usage

First download and preprocess the data following the main language modeling README.

Then to train a convolutional LM using the fconv_lm_dauphin_wikitext103 architecture:

fairseq-train --task language_modeling \
    data-bin/wikitext-103 \
    --save-dir checkpoints/fconv_wikitext-103 \
    --arch fconv_lm_dauphin_wikitext103 \
    --adaptive-softmax-cutoff 10000,20000,200000 \
    --dropout 0.2 \
    --criterion adaptive_loss \
    --optimizer nag --clip-norm 0.1 --weight-decay 5e-06 \
    --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 \
    --max-tokens 1024 --tokens-per-sample 1024 \
    --ddp-backend legacy_ddp \
    --max-epoch 35

And evaluate with:

fairseq-eval-lm data-bin/wikitext-103 --path checkpoints/fconv_wikitext-103/checkpoint_best.pt
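
The evaluation path has to point into the same directory that was passed to --save-dir during training. As a minimal sketch (the SAVE_DIR and CKPT variable names are illustrative, not part of fairseq), the checkpoint path can be derived from a single variable and checked before evaluation:

```shell
# Illustrative variable names; only the paths and the fairseq-eval-lm call
# come from the commands in this README.
SAVE_DIR=checkpoints/fconv_wikitext-103
CKPT="$SAVE_DIR/checkpoint_best.pt"

# Evaluate only if training actually produced the checkpoint.
if [ -f "$CKPT" ]; then
    fairseq-eval-lm data-bin/wikitext-103 --path "$CKPT"
else
    echo "checkpoint not found: $CKPT" >&2
fi
```

Passing the same variable to the training command as --save-dir "$SAVE_DIR" keeps the two paths from drifting apart.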

Citation

@inproceedings{dauphin2017language,
  title={Language Modeling with Gated Convolutional Networks},
  author={Dauphin, Yann N and Fan, Angela and Auli, Michael and Grangier, David},
  booktitle={Proceedings of the 34th International Conference on Machine Learning-Volume 70},
  pages={933--941},
  year={2017},
  organization={JMLR}
}
