Images and annotations are taken from https://dataturks.com/projects/devika.mishra/Indian_Number_plates
The annotations contain bounding boxes for each image, and each annotation file has the same name as its image. An example of training a model in Python is included; to use it, update the API key and model ID in the corresponding file. A folder of pre-processed JSON annotations is also provided, ready to use as payloads for the Nanonets API.
Note: Make sure Python and pip are installed on your system. If they are not, install them first (see the Python and pip websites).
git clone https://github.com/NanoNets/nanonets-ocr-sample-python.git
cd nanonets-ocr-sample-python
sudo pip install requests tqdm
Get your free API Key from http://app.nanonets.com/#/keys
export NANONETS_API_KEY=YOUR_API_KEY_GOES_HERE
python ./code/create-model.py
Note: This generates a MODEL_ID that you need for the next step.
export NANONETS_MODEL_ID=YOUR_MODEL_ID
Note: You will get YOUR_MODEL_ID from the previous step.
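Before running the remaining scripts, it can help to confirm that both variables are actually visible to Python. The following is a minimal sketch, not part of the sample repository: the helper name `missing_settings` is hypothetical, and it only checks that the two variables exported above are set and non-empty.

```python
import os

# The two variables exported in the steps above.
REQUIRED_VARS = ("NANONETS_API_KEY", "NANONETS_MODEL_ID")


def missing_settings(environ=None):
    """Return the names of required variables that are unset or blank.

    `environ` defaults to the real process environment; passing a plain
    dict makes the check easy to try out without exporting anything.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Example usage (hypothetical):
#   missing = missing_settings()
#   if missing:
#       raise SystemExit("Missing: " + ", ".join(missing))
```

If the returned list is empty, both values are available to the scripts that follow.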
The training data is found in images (image files) and annotations (annotations for the image files).
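Because each annotation shares its image's name, a quick local check can catch images with no matching annotation before uploading. This is a sketch under assumptions: the function name `pair_images_with_annotations` is hypothetical, and it expects the directory that directly contains the annotation files (the exact folder layout and file extensions are not specified here).

```python
from pathlib import Path


def pair_images_with_annotations(images_dir, annotations_dir):
    """Match image files to annotation files that share a base name.

    Returns (paired, unmatched): `paired` maps each image path to its
    annotation path, and `unmatched` lists images with no annotation.
    """
    # Index annotation files by name without extension, e.g. "151".
    annotations = {
        p.stem: p for p in Path(annotations_dir).iterdir() if p.is_file()
    }
    paired, unmatched = {}, []
    for image in sorted(Path(images_dir).iterdir()):
        if not image.is_file():
            continue
        match = annotations.get(image.stem)
        if match is None:
            unmatched.append(image)
        else:
            paired[image] = match
    return paired, unmatched
```

An empty `unmatched` list suggests that every image has an annotation with the same base name.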
python ./code/upload-training.py
Once the images have been uploaded, begin training the model
python ./code/train-model.py
The model takes about 2 hours to train. You will get an email once the model is trained. In the meantime, you can check the state of the model
python ./code/model-state.py
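Since training takes a couple of hours, checking the state by hand can get tedious. The sketch below shows one generic way to poll at a fixed interval. It is an illustration only: `wait_until` and its `check` callable are hypothetical names, and how `check` decides that the model is trained (for example, by inspecting what model-state.py reports) is left to the caller, because that output format is not described here.

```python
import time


def wait_until(check, interval_seconds=300.0, timeout_seconds=3 * 60 * 60,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or time runs out.

    Returns True if check() succeeded, False if the timeout passed first.
    `sleep` and `clock` are injectable so the loop can be tried quickly.
    """
    deadline = clock() + timeout_seconds
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        # Wait before asking again, to avoid checking in a tight loop.
        sleep(interval_seconds)
```

The default interval of five minutes and timeout of three hours are arbitrary choices based on the roughly two-hour training time mentioned above.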
Once the model is trained, you can make predictions using the model
python ./code/prediction.py PATH_TO_YOUR_IMAGE.jpg
Sample Usage:
python ./code/prediction.py ./images/151.jpg
Note: The Python sample uses the converted JSON instead of the XML payload for convenience, hence it has no dependencies.