Optical Character Recognition on the IAM words dataset

Following the blog post https://towardsdatascience.com/build-a-handwritten-text-recognition-system-using-tensorflow-2326a3487cd5, I use a CNN to extract features from images of words. A bidirectional RNN reads these features in sequence and outputs a sequence of classifications, which are compared against the ground truth using CTC loss.
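To make the "sequence of classifications" concrete, here is a minimal pure-Python sketch of how a CTC-style output is usually turned back into a word: take the best class at each step, merge consecutive repeats, then drop the blank symbol. This is the standard greedy (best-path) decoding rule, not necessarily what test.py does. The names `ctc_greedy_decode`, `BLANK` and `one_hot`, the choice of index 0 for the blank, and the toy alphabet are all assumptions made for illustration.

```python
from itertools import groupby

BLANK = 0  # assumption: class index 0 is reserved for the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet):
    """Decode per-step class scores into a string with the greedy CTC rule.

    frame_scores: one list of class scores per time step (index 0 = blank).
    alphabet: string whose character i corresponds to class index i + 1.
    """
    # 1. Pick the highest-scoring class at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    # 2. Merge runs of the same class into a single symbol.
    collapsed = [index for index, _ in groupby(best_path)]
    # 3. Drop blanks; a blank between two equal symbols keeps them distinct.
    return "".join(alphabet[index - 1] for index in collapsed if index != BLANK)


def one_hot(index, num_classes):
    """Helper for the toy example: a score vector with a single 1.0."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


if __name__ == "__main__":
    alphabet = "helo"  # toy alphabet: h=1, e=2, l=3, o=4
    # Per-step predictions: h h _ e l _ l o o  (where _ is the blank)
    path = [1, 1, 0, 2, 3, 0, 3, 4, 4]
    scores = [one_hot(i, len(alphabet) + 1) for i in path]
    print(ctc_greedy_decode(scores, alphabet))
```

The blank between the two `l` steps is what lets the decoder emit a double letter. Without it, the repeated `l` predictions would be merged into one.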

Currently I do not use any data augmentation or pretraining. I use Optuna to search for the hyperparameters (learning rate, number of CNN and RNN layers, kernel sizes, etc.) that perform best after 20 epochs. I then train a model with the best hyperparameters for 100 epochs, obtaining a word error rate of 16%.

Repository files:
utils
.gitignore
README.md
hparam_search.py
print_trials.py
test.py
train.py

Repository: alex-epp/ocr