MMOCR Release v0.4.0 · open-mmlab/mmocr

Highlights

We release a new text recognition model - ABINet (CVPR 2021, Oral). With dedicated model design and useful data augmentation transforms, ABINet achieves the best performance on irregular text recognition tasks. Check it out!
We are also working hard to fulfill requests from our community. OpenSet KIE is one of the results: it extends the application of SDMGR from text node classification to node-pair relation extraction. We also provide a demo script that converts WildReceipt to the open set domain, though the converted data may not take full advantage of the OpenSet format. For more information, read our tutorial.
Model APIs can now be exposed through TorchServe. See the docs for details.

Breaking Changes & Migration Guide

Postprocessor

Some refactoring is still in progress. For all text detection models, we unified their decoding implementations into a new module category, POSTPROCESSOR, which is responsible for decoding different raw outputs into boundary instances. In all text detection configs, the text_repr_type argument in bbox_head is deprecated and will be removed in a future release.

Migration Guide: Find a line like the following in the detection model's config:

text_repr_type=xxx,

And replace it with:

postprocessor=dict(type='{MODEL_NAME}Postprocessor', text_repr_type=xxx),

Take a snippet of PANet's config as an example. Before the change, its config for bbox_head looks like:

    bbox_head=dict(
        type='PANHead',
        text_repr_type='poly',
        in_channels=[128, 128, 128, 128],
        out_channels=6,
        loss=dict(type='PANLoss')),

Afterwards:

    bbox_head=dict(
        type='PANHead',
        in_channels=[128, 128, 128, 128],
        out_channels=6,
        loss=dict(type='PANLoss'),
        postprocessor=dict(type='PANPostprocessor', text_repr_type='poly')),

There are other postprocessors, and each takes different arguments. Interested users can find their interfaces and implementations in mmocr/models/textdet/postprocess or in our API docs.
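For configs that are built or patched programmatically, the migration above can be sketched as a small helper. This is only an illustration on plain dicts: the function name migrate_bbox_head and its model_name parameter are hypothetical and not part of MMOCR, and the sketch assumes the postprocessor follows the '{MODEL_NAME}Postprocessor' naming pattern shown in the guide.

```python
import copy


def migrate_bbox_head(bbox_head, model_name):
    """Return a copy of bbox_head with text_repr_type moved into postprocessor.

    Hypothetical helper for illustration only; it is not an MMOCR API.
    """
    new_head = copy.deepcopy(bbox_head)
    if 'text_repr_type' not in new_head:
        # Nothing to migrate: the key is absent or was already moved.
        return new_head
    text_repr_type = new_head.pop('text_repr_type')
    # Reuse an existing postprocessor dict if present, otherwise create one.
    postprocessor = new_head.setdefault('postprocessor', {})
    postprocessor.setdefault('type', f'{model_name}Postprocessor')
    postprocessor.setdefault('text_repr_type', text_repr_type)
    return new_head


# The "before" snippet of PANet's bbox_head from the guide above.
old_head = dict(
    type='PANHead',
    text_repr_type='poly',
    in_channels=[128, 128, 128, 128],
    out_channels=6,
    loss=dict(type='PANLoss'))

new_head = migrate_bbox_head(old_head, 'PAN')
print(new_head['postprocessor'])
# {'type': 'PANPostprocessor', 'text_repr_type': 'poly'}
```

The helper works on a deep copy, so the original dict is left untouched and the two versions can be compared side by side.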

New Config Structure

We reorganized the configs/ directory by extracting reusable sections into configs/_base_. Now the directory tree of configs/_base_ is organized as follows:

_base_
├── det_datasets
├── det_models
├── det_pipelines
├── recog_datasets
├── recog_models
├── recog_pipelines
└── schedules

Most model configs now make full use of the base configs, which makes the overall structure clearer and facilitates fair comparison across models. Despite the seemingly significant change in hierarchy, these changes do not break backward compatibility, as the names of the model configs remain the same.
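To see why shared base sections help, consider how a model config can be composed from them. The sketch below is a simplified stand-in written on plain dicts, assuming that child values are merged recursively over base values; it is not the actual config loader, and the field values (optimizer type, learning rates, epoch count) are made up for illustration.

```python
import copy


def merge_config(base, override):
    """Recursively merge override into a copy of base (override wins)."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge field by field.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace or extend the base.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base sections, standing in for files under det_models and schedules.
base_model = dict(
    model=dict(type='PANet', bbox_head=dict(type='PANHead', out_channels=6)))
base_schedule = dict(optimizer=dict(type='Adam', lr=1e-4), total_epochs=600)

# A model config only states what differs from its bases.
child = dict(optimizer=dict(lr=1e-3))

cfg = {}
for part in (base_model, base_schedule, child):
    cfg = merge_config(cfg, part)

print(cfg['optimizer'])
# {'type': 'Adam', 'lr': 0.001}
```

Because each model config only records its differences from the shared sections, two models that use the same dataset, pipeline, and schedule bases are trained under the same settings by construction.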

New Features

Support openset kie by @cuhk-hbsun in #498
Add converter for the Open Images v5 text annotations by Krylov et al. by @baudm in #497
Support Chinese for kie show result by @cuhk-hbsun in #464
Add TorchServe support for text detection and recognition by @Harold-lkk in #522
Save filename in text detection test results by @cuhk-hbsun in #570
Add codespell pre-commit hook and fix typos by @gaotongxiao in #520
Avoid duplicate placeholder docs in CN by @gaotongxiao in #582
Save results to json file for kie. by @cuhk-hbsun in #589
Add SAR_CN to ocr.py by @gaotongxiao in #579
mim extension for windows by @gaotongxiao in #641
Support multiple pipelines for different datasets by @cuhk-hbsun in #657
ABINet Framework by @gaotongxiao in #651

Refactoring

Refactor textrecog config structure by @cuhk-hbsun in #617
Refactor text detection config by @cuhk-hbsun in #626
Refactor transformer modules by @cuhk-hbsun in #618
Refactor textdet postprocess by @cuhk-hbsun in #640

Docs

C++ example section by @apiaccess21 in #593
install.md Chinese section by @A465539338 in #364
Add Chinese Translation of deployment.md. by @fatfishZhao in #506
Fix a model link and add the metafile for SATRN by @gaotongxiao in #473
Improve docs style by @gaotongxiao in #474
Enhancement & sync Chinese docs by @gaotongxiao in #492
TorchServe docs by @gaotongxiao in #539
Update docs menu by @gaotongxiao in #564
Docs for KIE CloseSet & OpenSet by @gaotongxiao in #573
Fix broken links by @gaotongxiao in #576
Docstring for text recognition models by @gaotongxiao in #562
Add MMFlow & MIM by @gaotongxiao in #597
Add MMFewShot by @gaotongxiao in #621
Update model readme by @gaotongxiao in #604
Add input size check to model_inference by @mpena-vina in #633
Docstring for textdet models by @gaotongxiao in #561
Add MMHuman3D in readme by @gaotongxiao in #644
Use shared menu from theme instead by @gaotongxiao in #655
Refactor docs structure by @gaotongxiao in #662
Docs fix by @gaotongxiao in #664

Enhancements

Use bounding box around polygon instead of within polygon by @alexander-soare in #469
Add CITATION.cff by @gaotongxiao in #476
Add py3.9 CI by @gaotongxiao in #475
update model-index.yml by @gaotongxiao in #484
Use container in CI by @gaotongxiao in #502
CircleCI Setup by @gaotongxiao in #611
Remove unnecessary custom_import from train.py by @gaotongxiao in #603
Change the upper version of mmcv to 1.5.0 by @zhouzaida in #628
Update CircleCI by @gaotongxiao in #631
Pass custom_hooks to MMCV by @gaotongxiao in #609
Skip CI when some specific files were changed by @gaotongxiao in #642
Add markdown linter in pre-commit hook by @gaotongxiao in #643
Use shape from loaded image by @cuhk-hbsun in #652
Cancel previous runs that are not completed by @Harold-lkk in #666

Bug Fixes

Modify algorithm "sar" weights path in metafile by @ShoupingShan in #581
Fix Cuda CI by @gaotongxiao in #472
Fix image export in test.py for KIE models by @gaotongxiao in #486
Allow invalid polygons in intersection and union by default by @gaotongxiao in #471
Update checkpoints' links for SATRN by @gaotongxiao in #518
Fix converting to onnx bug because of changing key from img_shape to resize_shape by @Harold-lkk in #523
Fix PyTorch 1.6 incompatible checkpoints by @gaotongxiao in #540
Fix paper field in metafiles by @gaotongxiao in #550
Unify recognition task names in metafiles by @gaotongxiao in #548
Fix py3.9 CI by @gaotongxiao in #563
Always map location to cpu when loading checkpoint by @gaotongxiao in #567
Fix wrong model builder in recog_test_imgs by @gaotongxiao in #574
Improve dbnet r50 by fixing img std by @gaotongxiao in #578
Fix resource warning: unclosed file by @cuhk-hbsun in #577
Fix bug that same start_point for different texts in draw_texts_by_pil by @cuhk-hbsun in #587
Keep original texts for kie by @cuhk-hbsun in #588
Fix random seed by @gaotongxiao in #600
Fix DBNet_r50 config by @gaotongxiao in #625
Change SBC case to DBC case by @cuhk-hbsun in #632
Fix kie demo by @innerlee in #610
Fix type check by @cuhk-hbsun in #650
Remove deprecated image validator in totaltext converter by @gaotongxiao in #661
Fix change locals() dict by @Fei-Wang in #663
fix #614: textsnake targets by @HolyCrap96 in #660

New Contributors

@alexander-soare made their first contribution in #469
@A465539338 made their first contribution in #364
@fatfishZhao made their first contribution in #506
@baudm made their first contribution in #497
@ShoupingShan made their first contribution in #581
@apiaccess21 made their first contribution in #593
@zhouzaida made their first contribution in #628
@mpena-vina made their first contribution in #633
@Fei-Wang made their first contribution in #663

Full Changelog: v0.3.0...v0.4.0
