A Python script for OCR using the Google Cloud Vision API
-
Install Python 3
-
Install the dependencies (see https://cloud.google.com/python/setup for steps 1 and 2):
python3 -m pip install google-cloud-vision
or
pip install --upgrade google-cloud-vision
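To confirm that the install worked, a quick check along these lines can help. It is only a minimal sketch: the helper name `is_vision_installed` is made up for illustration and is not part of the script.

```python
import importlib.util


def is_vision_installed() -> bool:
    """Return True if the google-cloud-vision package can be located."""
    try:
        # find_spec imports the parent packages ("google", "google.cloud"),
        # so it raises ModuleNotFoundError when they are missing.
        return importlib.util.find_spec("google.cloud.vision") is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    if is_vision_installed():
        print("google-cloud-vision is installed")
    else:
        print("google-cloud-vision is missing; run:")
        print("python3 -m pip install google-cloud-vision")
```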
-
Follow instructions on this page : https://cloud.google.com/vision/docs/before-you-begin
  - Create a Google Cloud account or log in with your Google ID
  - Create a Cloud project
  - Enable billing for the project
  - Enable the Google Cloud Vision API
  - Set up authentication
Note: You may be offered a USD 300 credit for usage. Accept it; it may let you OCR a few thousand images without charges.
-
Download the latest version of the script, which has a name like GoogleVisionOCR_v*. Older versions are kept only to learn from past errors; do not use them.
-
Open the script, search for "path_to_secret_key.json", and replace it with the path to your .json authentication file (the one you downloaded while following the instructions in step 3).
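The replacement in this step amounts to pointing the Google client library at your key file. The sketch below shows one common way to do that, through the GOOGLE_APPLICATION_CREDENTIALS environment variable, with a sanity check on the file first. It assumes the file is a service-account key; whether the script uses this variable or passes the path some other way is also an assumption, and the function name `use_key_file` is hypothetical.

```python
import json
import os


def use_key_file(key_path: str) -> str:
    """Validate a service-account key file and expose it to Google client libraries."""
    key_path = os.path.abspath(os.path.expanduser(key_path))
    if not os.path.isfile(key_path):
        raise FileNotFoundError(f"Key file not found: {key_path}")
    with open(key_path, encoding="utf-8") as fh:
        info = json.load(fh)
    # Assumption: the file is a service-account key, which carries this marker.
    if not isinstance(info, dict) or info.get("type") != "service_account":
        raise ValueError(f"{key_path} does not look like a service-account key")
    # Google client libraries read this variable to locate credentials.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path
    return key_path
```

Calling `use_key_file("path_to_secret_key.json")` before creating a Vision client fails early, with a readable message, if the path is wrong.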
-
Save the script.
-
If needed, copy it to a directory on your system PATH so that you can call it from anywhere.
-
Make the script executable.
chmod +x path_to_script
-
Windows users should check that their terminal is using UTF-8 encoding. If it is not, run:
chcp 65001
set PYTHONIOENCODING=utf-8
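The encoding can also be checked from inside Python. This sketch reports the encoding that standard output is using and, where the stream supports it, switches it to UTF-8. The helper name `ensure_utf8_stdout` is illustrative and is not taken from the script.

```python
import sys


def ensure_utf8_stdout() -> str:
    """Switch stdout to UTF-8 when possible and return the encoding in use."""
    encoding = (getattr(sys.stdout, "encoding", None) or "").lower()
    if encoding.replace("-", "").replace("_", "") != "utf8":
        # reconfigure() exists on text streams since Python 3.7.
        reconfigure = getattr(sys.stdout, "reconfigure", None)
        if reconfigure is not None:
            try:
                reconfigure(encoding="utf-8")
            except (ValueError, OSError):
                pass  # leave the stream unchanged if it cannot be reconfigured
    return getattr(sys.stdout, "encoding", None) or "unknown"


if __name__ == "__main__":
    print("stdout encoding:", ensure_utf8_stdout())
```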
-
Now run the script as
path_to_script
-
Follow the instructions in the terminal and provide the path to the file or folder you want to OCR.
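For orientation, the core of an OCR run over a file or folder might look like the sketch below. It is not the script's actual code: the helper names, the list of image extensions, and the choice of `text_detection` are all assumptions. It relies on the `google-cloud-vision` client (version 2 or later), which is imported lazily so that the path handling can be tried without credentials.

```python
import os
from typing import List

# Assumed set of image types; the real script may accept others.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".webp", ".tif", ".tiff"}


def collect_images(path: str) -> List[str]:
    """Return [path] for a file, or the image files directly inside a folder."""
    path = os.path.expanduser(path)
    if os.path.isfile(path):
        return [path]
    if os.path.isdir(path):
        return sorted(
            os.path.join(path, name)
            for name in os.listdir(path)
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
        )
    raise FileNotFoundError(f"No such file or folder: {path}")


def ocr_image(image_path: str) -> str:
    """Send one image to the Vision API and return the detected text."""
    # Lazy import: needs google-cloud-vision installed and credentials set up.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open(image_path, "rb") as fh:
        image = vision.Image(content=fh.read())
    response = client.text_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    annotations = response.text_annotations
    # The first annotation holds the full text; the rest are individual pieces.
    return annotations[0].description if annotations else ""
```

Calling `ocr_image` on each path returned by `collect_images` gives the text for a single file or for every image in a folder. Each call is a billable API request.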