Image preprocessing for Tess5-hw-training.

This script preprocesses handwritten images and generates ground truth for training a Tesseract model, with the goal of improving the accuracy of the Tesseract OCR engine on handwriting.

Clone

▶ git clone https://github.com/vigneshkannan255/Tess5-hw-training.git

Required Python Packages

Python packages installation

▶ pip3 install Image
▶ pip3 install opencv-python

Required Linux Packages

tesseract-ocr Installation

▶ sudo apt install tesseract-ocr

tesseract-ocr Tamil Language Installation

▶ sudo apt install tesseract-ocr-tam
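
To confirm that both packages installed correctly, the following sketch asks Tesseract for its language list and checks for `tam`. The helper names `parse_langs` and `tamil_available` are hypothetical and not part of this repository. The only assumption is that `tesseract --list-langs` prints a header line followed by one language code per line.

```python
import shutil
import subprocess


def parse_langs(listing: str) -> list[str]:
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [ln.strip() for ln in listing.splitlines() if ln.strip()]
    # Skip the header line, e.g. 'List of available languages (3):'
    return [ln for ln in lines if not ln.lower().startswith("list of")]


def tamil_available() -> bool:
    """Return True if the tesseract binary is on PATH and lists 'tam'."""
    if shutil.which("tesseract") is None:
        return False
    result = subprocess.run(
        ["tesseract", "--list-langs"], capture_output=True, text=True
    )
    # Some Tesseract versions print the list to stderr, so check both streams.
    return "tam" in parse_langs(result.stdout + "\n" + result.stderr)


if __name__ == "__main__":
    print("Tamil language data available:", tamil_available())
```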

Process

  • Original to grayscale.
  • Contrast Increase.
  • Brightness Increase.
  • Cropping Images line by line.
  • Generating Negative Image from cropped Image.
  • Generating ground truth from cropped Image.
  • Generating ground truth from Negative Image.

Sample Input Image.

Preprocessed Image.

Sample Cropped Image.

Negative Image.
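
For the ground-truth steps, one possible approach is to run Tesseract on each cropped or negative line image and save the result next to it. The sketch below assumes the `name.gt.txt` naming used by the tesstrain tooling and a single-line page segmentation mode (`--psm 7`). The function names are hypothetical, and the repository's own script may do this differently. Machine-generated text like this normally needs manual correction before it is usable as ground truth.

```python
import subprocess
from pathlib import Path


def ground_truth_command(image_path: str, lang: str = "tam") -> list[str]:
    """Build the tesseract command that transcribes one line image.

    Tesseract appends '.txt' to the output base, so an output base of
    'line_001.gt' produces 'line_001.gt.txt' beside the image.
    """
    image = Path(image_path)
    out_base = image.with_suffix(".gt")
    # --psm 7 tells Tesseract to treat the image as a single text line.
    return ["tesseract", str(image), str(out_base), "-l", lang, "--psm", "7"]


def write_ground_truth(image_path: str, lang: str = "tam") -> Path:
    """Run tesseract on the image and return the path of the text file it wrote."""
    cmd = ground_truth_command(image_path, lang)
    subprocess.run(cmd, check=True)
    return Path(cmd[2] + ".txt")
```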