This is a machine learning dataset containing two versions of the Bible. Its purpose is to support experiments in "translating" modern English to 17th-century English and vice versa.
The dataset contains roughly 31K verse pairs, split 80/20 into training and evaluation sets. The files are available as either CSV or JSONL.
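
Since the files come in two formats, a small loader that accepts either one may be convenient. The following is a minimal sketch using only the Python standard library. The function name `load_pairs` is hypothetical, and it assumes the CSV files begin with a header row and the JSONL files hold one JSON object per line; neither the file names nor the column names are specified here, so check the actual files before relying on it.

```python
import csv
import json
from pathlib import Path


def load_pairs(path):
    """Load verse pairs from a .csv or .jsonl file into a list of dicts.

    Hypothetical helper: assumes a header row for CSV and one JSON
    object per line for JSONL.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".jsonl":
        # JSONL: parse each non-empty line as its own JSON object.
        with path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
    if suffix == ".csv":
        # CSV: DictReader uses the first row as the column names.
        with path.open(encoding="utf-8", newline="") as f:
            return [dict(row) for row in csv.DictReader(f)]
    raise ValueError(f"unsupported file type: {path.suffix!r}")
```

Calling `load_pairs` on a training file and again on an evaluation file would give two lists of records whose keys are whatever field names the files actually use.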