docTR training reference datasets #1654
felixdittrich92 started this conversation in Show and tell
-
Hi guys! Thanks for your excellent work! Could you provide an example of how to train a detection model using internal datasets? I see that train_tensorflow.py expects a slightly different format for the training set. Expected format for the internal training set:
{
    "sample_img_01.png": {
        'img_dimensions': (900, 600),
        'img_hash': "theimagedumpmyhash",
        'polygons': [[[x1, y1], [x2, y2], [x3, y3], [x4, y4]], ...]
    },
    "sample_img_02.png": {
        'img_dimensions': (900, 600),
        'img_hash': "thisisahash",
        'polygons': [[[x1, y1], [x2, y2], [x3, y3], [x4, y4]], ...]
    },
    ...
}
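The mapping above can be produced with a small script. This is a minimal sketch, not docTR's own tooling: `build_detection_labels` is a hypothetical helper, and using SHA-256 for `img_hash` is an assumption, since the expected format does not say how the hash is computed.

```python
import hashlib
import json

def build_detection_labels(samples):
    """Build a labels mapping in the format shown above.

    `samples` yields (filename, image_bytes, (height, width), polygons)
    tuples. SHA-256 for 'img_hash' is an assumption; the hashing
    scheme is not stated in the expected format itself.
    """
    labels = {}
    for filename, image_bytes, dims, polygons in samples:
        labels[filename] = {
            "img_dimensions": dims,
            "img_hash": hashlib.sha256(image_bytes).hexdigest(),
            "polygons": polygons,
        }
    return labels

# One synthetic entry; real image bytes would be read from disk.
samples = [
    ("sample_img_01.png", b"fake-png-bytes", (900, 600),
     [[[10, 10], [200, 10], [200, 50], [10, 50]]]),
]
labels = build_detection_labels(samples)

# Dump to the JSON file the training script reads.
with open("labels.json", "w") as f:
    json.dump(labels, f, indent=2)
```

Note that `json.dump` serializes the `(900, 600)` tuple as a JSON array, which is how dimensions end up stored on disk.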
-
The provided link contains reference datasets:
NOTE: train and val contain the same data. You should split your custom dataset and avoid duplications.
detection_task: docTR detection training
recognition_task: docTR recognition training
Reference datasets: Datasets
Docs: Training documentation
Recognition: README
Detection: README
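The note about splitting can be sketched as a small helper that partitions one labels mapping into disjoint train/val mappings. This is an illustration, not part of docTR; `split_labels`, the 10% default, and the fixed seed are all choices made here for the example.

```python
import random

def split_labels(labels, val_fraction=0.1, seed=42):
    """Split a labels mapping into disjoint train/val mappings.

    Sorting before shuffling with a fixed seed makes the split
    reproducible, and the two outputs never share a sample, which
    avoids the train/val duplication warned about above.
    """
    names = sorted(labels)
    random.Random(seed).shuffle(names)
    n_val = max(1, int(len(names) * val_fraction))
    val_names = set(names[:n_val])
    train = {k: v for k, v in labels.items() if k not in val_names}
    val = {k: v for k, v in labels.items() if k in val_names}
    return train, val
```

Each resulting mapping can then be written to its own labels file for the training and validation paths.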