Generate text images for training deep learning OCR models (e.g. CRNN). Supports both Latin and non-Latin text.
- Ubuntu 16.04
- python 3.5+
Install dependencies:
pip3 install -r requirements.txt
By default, running python3 main.py generates 20 text images and a labels.txt file in output/default/.
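To feed the generated output into a training pipeline, the label file has to be paired with the images. The sketch below assumes each line of labels.txt holds an image name, a single space, and then the label text. That layout is an assumption, not something stated here, so check your own output first. The function name parse_labels is hypothetical.

```python
def parse_labels(lines):
    """Map image name -> label text.

    Assumes each non-empty line looks like '<image_name> <text>'.
    This layout is an assumption; verify it against your labels.txt.
    """
    labels = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first space only, so the label may contain spaces.
        name, _, text = line.partition(" ")
        labels[name] = text
    return labels
```

It accepts any iterable of lines, so an open file object (opened with encoding="utf-8") can be passed directly.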
- Run python3 main.py --help to see all optional arguments and their meanings, and put your own data in the corresponding folder.
- Configure text effects and their fractions in configs/default.yaml (or create a new config file and pass it with the --config_file option).
- Run main.py.
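One plausible reading of an effect's "fraction" is the share of generated images that receive the effect. The sketch below illustrates that reading only. The function name should_apply is hypothetical, and the config file's actual keys and semantics should be checked in configs/default.yaml.

```python
import random


def should_apply(fraction, rng=random):
    """Decide whether to apply an effect to the current image.

    Assumption: 'fraction' is a probability in [0, 1], so about
    fraction * N of N generated images get the effect.
    """
    # rng.random() is uniform in [0.0, 1.0), so 0.0 never fires and 1.0 always does.
    return rng.random() < fraction
```

Passing a seeded random.Random instance as rng makes the decisions reproducible between runs.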
For non-Latin languages (e.g. Chinese), it is very common for a font to support only a limited set of characters. In this case, you will get bad results.
Manually selecting fonts that support all characters in --chars_file is tedious.
Run main.py with the --strict option, and the renderer will keep retrying to get text from the corpus during generation until all of its characters are supported by a font.
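The retry behaviour described for --strict can be pictured roughly as follows. This is a minimal sketch, not the renderer's actual code. The names pick_supported_text and supported_chars are hypothetical, and the max_retries cap is an added assumption to keep the loop from running forever.

```python
import random


def pick_supported_text(corpus, supported_chars, max_retries=100, rng=random):
    """Draw texts from the corpus until one uses only supported characters.

    Returns the text, or None if no candidate passes within max_retries
    (the retry cap is an assumption of this sketch).
    """
    for _ in range(max_retries):
        text = rng.choice(corpus)
        # Accept the text only if the font can render every character in it.
        if all(ch in supported_chars for ch in text):
            return text
    return None
```

If a font covers only a small part of the corpus, most draws are rejected, which is why it pays to check font coverage first.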
You can use the check_font.py script to check how many characters in --chars_file your fonts do not support:
python3 tools/check_font.py
checking font ./data/fonts/eng/Hack-Regular.ttf
chars not supported(4971):
['第', '朱', '广', '沪', '联', '自', '治', '县', '驼', '身', '进', '行', '纳', '税', '防', '火', '墙', '掏', '心', '内', '容', '万', '警','钟', '上', '了', '解'...]
0 fonts support all chars(5071) in ./data/chars/chn.txt:
[]
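A check like the one above can be approximated by comparing the wanted characters with the code points in a font's character map. The sketch below uses the third-party fontTools package and is not necessarily how check_font.py works. The function names are hypothetical.

```python
def unsupported_chars(chars, supported_codepoints):
    """Return the characters whose code points the font does not map."""
    return [ch for ch in chars if ord(ch) not in supported_codepoints]


def font_codepoints(font_path):
    """Collect the Unicode code points a font maps to glyphs.

    Needs fontTools (pip3 install fonttools); imported here so the
    pure helper above works without it.
    """
    from fontTools.ttLib import TTFont

    font = TTFont(font_path)
    cmap = font.getBestCmap() or {}  # maps code point -> glyph name
    return set(cmap)
```

For example, unsupported_chars(chars, font_codepoints("./data/fonts/eng/Hack-Regular.ttf")) lists the missing characters for that font. A character map entry only shows that the font claims the character, and the glyph itself may still be empty.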
If you want to use a GPU to speed up image generation, first compile OpenCV with CUDA (see: Compiling OpenCV with CUDA support).
Then build the Cython part, and add the --gpu option when running main.py:
cd libs/gpu
python3 setup.py build_ext --inplace
Running python3 main.py --debug saves images with extra information, so you can see how perspectiveTransform works and all bounding/rotated boxes.
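To interpret the debug boxes, it helps to recall what a perspective transform does to points: each (x, y) is multiplied by a 3x3 matrix in homogeneous coordinates and then divided by the resulting w. The pure-Python sketch below shows that mapping and the axis-aligned box around the result. It illustrates the math only, is not the renderer's code, and its function names are hypothetical.

```python
def transform_points(matrix, points):
    """Apply a 3x3 homography to (x, y) points using homogeneous coordinates."""
    out = []
    for x, y in points:
        xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
        yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
        w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
        # Dividing by w is what turns an affine map into a perspective one.
        out.append((xh / w, yh / w))
    return out


def bounding_box(points):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)
```

Transforming the four corners of a text box and taking bounding_box of the result gives the axis-aligned box, while the transformed corners themselves outline the rotated or skewed box.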