thai2nmt: English-Thai Machine Translation Models

This repository contains the code to reproduce our experiments on Thai-English NMT models, together with scripts to download the datasets (scb-mt-en-th-2020, mt-opus, and scb-mt-en-th-2020+mt-opus) along with the train/validation/test splits used in the experiments.

Our experiments are listed below.

Experiment #1 TBASE.SCB-1M -- Transformer BASE models trained on scb-mt-en-th-2020 v1.0
Experiment #2 TBASE.MT-OPUS -- Transformer BASE models trained on the English-Thai datasets listed in the Open Parallel Corpus (OPUS)
Experiment #3 TBASE.SCB-1M+MT-OPUS -- Transformer BASE models trained on both scb-mt-en-th-2020 v1.0 and the English-Thai datasets listed in the Open Parallel Corpus (OPUS)

BibTeX entry and citation info

@Article{Lowphansirikul2021,
    author={Lowphansirikul, Lalita
            and Polpanumas, Charin
            and Rutherford, Attapol T.
            and Nutanong, Sarana},
    title={A large English--Thai parallel corpus from the web and machine-generated text},
    journal={Language Resources and Evaluation},
    year={2021},
    month={Mar},
    day={30},
    issn={1574-0218},
    doi={10.1007/s10579-021-09536-6},
    url={https://doi.org/10.1007/s10579-021-09536-6}
}
