Vishnunkumar/doc_transformers

Doc Transformers

Document processing using transformers. The project is still under development and currently supports only the extraction of form data, i.e. key-value pairs.

pip install -q doc-transformers

Pre-requisites

Please install the following separately:

pip install pip --upgrade
pip install -q git+https://github.com/huggingface/transformers.git

pip install pyyaml==5.1

# workaround: install old version of pytorch since detectron2 hasn't released packages for pytorch 1.9 (issue: https://github.com/facebookresearch/detectron2/issues/3158)
pip install torch==1.8.0+cu101 torchvision==0.9.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html

# install detectron2 that matches pytorch 1.8
# See https://detectron2.readthedocs.io/tutorials/install.html for instructions
pip install -q detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.8/index.html
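
Because the detectron2 wheel above is built against one specific PyTorch release, it can help to confirm the pinned versions before running the parser. The following is a minimal sketch, not part of doc_transformers: the names `EXPECTED` and `check_versions` are hypothetical, and the version prefixes are simply copied from the pip commands above.

```python
from importlib import metadata

# Version prefixes taken from the pip commands above (assumption: adjust
# them if you install a different PyTorch / detectron2 combination).
EXPECTED = {"torch": "1.8.0", "torchvision": "0.9.0"}


def check_versions(expected, get_version=metadata.version):
    """Return {package: problem description} for every mismatch found."""
    problems = {}
    for package, prefix in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            problems[package] = "not installed"
            continue
        # Local build tags such as "+cu101" follow the prefix, so a
        # prefix comparison is enough for this check.
        if not installed.startswith(prefix):
            problems[package] = f"found {installed}, expected {prefix}*"
    return problems


if __name__ == "__main__":
    for package, problem in check_versions(EXPECTED).items():
        print(f"{package}: {problem}")
```

If the script prints nothing, the installed versions match the pinned ones.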

Implementation

# importing the parser also loads the pretrained dataset
from doc_transformers import parser

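# path to the input document image (placeholder, replace with your own file)
input_path_image = "path/to/image.png"

# loads the image and labels
image = parser.load_image(input_path_image)
labels = parser.load_tags()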

# loads the model
feature_extractor, processor, model = parser.load_models()

# returns the bounding boxes, predictions, extracted words and the processed image
kp = parser.process_image(image, feature_extractor, processor, model, labels)

Results

Input & Output

Table

  • After saving to CSV, the result looks like the following:
LABEL  TEXT
title  CREDIT CARD VOUCHER ANY RESTAURANT
title  ANYWHERE
key    DATE:
value  02/02/2014
key    TIME:
value  11:11
key    CARD
key    TYPE:
value  MC
key    ACCT:
value  XXXX XXXX XXXX
value  1111
key    TRANS
key    KEY:
value  HYU8789798234
key    AUTH
key    CODE:
value  12345
key    EXP
key    DATE:
value  XX/XX
key    CHECK:
value  1111
key    TABLE:
value  11/11
key    SERVER:
value  34
value  MONIKA
key    Subtotal:
value  $1969
value  .69
key    Gratuity: Total:
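
The rows above label each text fragment separately, so a multi-word key such as "CARD TYPE:" spans two rows. The sketch below shows one way to write such rows to CSV and to merge them into key-value pairs. It is an illustration only: `write_csv` and `group_pairs` are hypothetical helpers, not part of doc_transformers, and it assumes the output has already been converted to `(label, text)` tuples, since the structure returned by `parser.process_image` is not documented here.

```python
import csv
import io


def write_csv(rows, handle):
    """Write (label, text) tuples under the LABEL/TEXT header shown above."""
    writer = csv.writer(handle)
    writer.writerow(["LABEL", "TEXT"])
    writer.writerows(rows)


def group_pairs(rows):
    """Merge consecutive 'key' rows and the 'value' rows that follow them.

    Rows with any other label (for example 'title') are skipped.
    """
    pairs = []
    key_parts, value_parts = [], []
    for label, text in rows:
        if label == "key":
            # A key that arrives after collected values starts a new pair.
            if value_parts:
                pairs.append((" ".join(key_parts), " ".join(value_parts)))
                key_parts, value_parts = [], []
            key_parts.append(text)
        elif label == "value":
            value_parts.append(text)
    # Flush the last pair, which may be a key without any value.
    if key_parts or value_parts:
        pairs.append((" ".join(key_parts), " ".join(value_parts)))
    return pairs


if __name__ == "__main__":
    sample = [
        ("title", "ANYWHERE"),
        ("key", "DATE:"),
        ("value", "02/02/2014"),
        ("key", "CARD"),
        ("key", "TYPE:"),
        ("value", "MC"),
    ]
    buffer = io.StringIO()
    write_csv(sample, buffer)
    print(buffer.getvalue())
    print(group_pairs(sample))
```

Note that this simple grouping cannot tell a two-word key from two separate keys that have no values, so a row like "Gratuity: Total:" is kept as a single key.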

Code credits

@HuggingFace

  • Please note that this project is still in the development phase and will be improved in the near future.