Hi @dhifafaz 👋, Firstly I suggest to use The best option would be to provide some real samples for fine-tuning (~2K train / 400 val) while keeping the French vocab (this will not reset the classifier head). Could you please attach some samples from the try with the BTW: there is no need to augment your images up front; we do this internally. (Please check that you are on the current docTR version from the main branch -> 0.7.0.)
@felixT2K Hi, I'm back. I'm sorry, but I still get the same problem. CUDA and everything else is freshly installed, because I'm using a different environment...
Currently I'm trying your suggestion to "provide some real examples". I wonder what the train command should look like to reach the text recognition capability of the pretrained docTR model using my generated dataset with custom fonts.
@dhifafaz short update: You should try the pretrained
What I want to do
What I have done and the problem
trdg -c 108000 -dt id-2.txt -fd /doc-ext/docTR-finetune/fonts -w 3 -b 0 -k 3 -rk -rs -tc '#000000,#888888' --output_dir /doc-ext/docTR-finetune/dataset-text-reco-v3/train_set/images -f 32 -t 20 -sw 0 -na 2 --margins 3,3,3,3
I did the same for the val_set, with a much smaller number of samples. Then I tried to fine-tune it with this command in docTR:
However, because of a missing image (this is TRDG's fault), the training process stopped partway through, although it managed to save the best model. When I use my fine-tuned model.pt in an end-to-end docTR extraction, the results are very different from what they should be. This is how I load the fine-tuned model:
PS: I also tried changing the vocab parameter, and even omitting it. It still gives bad results, even though the exact and partial match rates were almost 90% with a loss of 0.05 during training.
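Both points raised here (an image missing from the generated set, and a possible vocab mismatch) can be checked before training starts. The following is only a minimal sketch, not part of docTR or TRDG: it assumes the labels are stored as a `labels.json` file mapping each image file name to its text, and the function name `check_recognition_set` is made up for illustration.

```python
import json
from pathlib import Path


def check_recognition_set(img_folder, labels_path, vocab):
    """Return (missing_images, out_of_vocab) for a recognition dataset.

    Assumption: labels_path is a JSON object mapping image file name -> text label.
    """
    with open(labels_path, encoding="utf-8") as f:
        labels = json.load(f)

    img_folder = Path(img_folder)
    allowed = set(vocab)

    # Label entries whose image file does not exist on disk
    missing_images = sorted(
        name for name in labels if not (img_folder / name).is_file()
    )

    # For each label, the characters that are not part of the vocab
    out_of_vocab = {
        name: sorted(set(text) - allowed)
        for name, text in labels.items()
        if set(text) - allowed
    }
    return missing_images, out_of_vocab
```

If `missing_images` is non-empty, those entries can be dropped from the labels file (or the images regenerated) before training. If `out_of_vocab` is non-empty, the chosen vocab does not cover the generated text.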
Loading the fine-tuned model the same way as I mentioned earlier, the results are pretty much the same: very different from the PDF I tested. This is the result I get:
This is the PDF document:
The code I use looks like this:
I also calculated the CER (Character Error Rate) on 1,000 samples using the evaluate package from HF, and it gives an error of only 0.05.
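For reference, CER is the total number of character edits (substitutions, insertions, deletions) divided by the total number of reference characters. The plain-Python sketch below shows that definition; it is not the `evaluate` implementation, and the function names are chosen here for illustration only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (single-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def corpus_cer(references, predictions):
    """Total character edits divided by total reference characters."""
    total_edits = sum(edit_distance(r, p) for r, p in zip(references, predictions))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars
```

Computed this way on pre-cropped word images, the score only reflects the recognition model on those crops. It may not match what the end-to-end pipeline produces on a full PDF page.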
My question is: what is happening here? How can I reproduce the current CRNN_VGG_16 capability with my custom fonts? I need to know where my mistake is.
Thank you so much!