Part_id bug with idp4 dataset from tagtog #113
Comments
Unfortunately, Shpend's annotations are rather difficult to recover, since all figure annotation offsets are wrong (due to the old parsing getting all figure captions wrong). Ectelion's still work, so we go with them. Just in case, here is the backup of Shpend's original annotations. NOTE: the wrong `s4s1p2` annotations were already changed to `s4s1f1p1`, EXCEPT for the relation offsets ... The relation offset mappings for those should be:
Shpendi-aYwkx1JUQj1EJKF0BpuDGtqNMqnK-PMC3613162.ann.json.zip
New html: rather use Rostlab/nala@a979175 (this is in xml form)
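The part-id correction described in the note above (`s4s1p2` to `s4s1f1p1`) still has to be applied to the relation references. Below is a minimal sketch of how that could be done. It assumes that each relation in the ann.json carries an `entities` list of strings shaped like `part|classId|start,end`, which is an assumption about the file layout and not something confirmed in this thread. The function name `remap_relation_parts` and the `offset_map` parameter are hypothetical, and the real offset mappings are not given here, so the values in the usage example are placeholders only.

```python
import json


def remap_relation_parts(ann, old_part, new_part, offset_map=None):
    """Return a copy of `ann` whose relation references use `new_part`.

    Assumption: every relation has an "entities" list of strings in the
    form "part|classId|start,end". `offset_map` (hypothetical) maps an old
    start offset to its new start offset; unmapped offsets are kept.
    """
    offset_map = offset_map or {}
    fixed = json.loads(json.dumps(ann))  # deep copy of plain JSON data
    for relation in fixed.get("relations", []):
        new_refs = []
        for ref in relation["entities"]:
            part, class_id, span = ref.split("|")
            if part == old_part:
                start, end = (int(x) for x in span.split(","))
                new_start = offset_map.get(start, start)
                # Shift the end by the same amount so the length is kept.
                end += new_start - start
                ref = f"{new_part}|{class_id}|{new_start},{end}"
            new_refs.append(ref)
        relation["entities"] = new_refs
    return fixed


# Placeholder data, not taken from the real document.
example = {
    "relations": [
        {"classId": "r_1", "entities": ["s4s1p2|e_1|100,105", "s1p1|e_2|3,7"]}
    ]
}
remapped = remap_relation_parts(example, "s4s1p2", "s4s1f1p1", {100: 10})
```

The input is copied before it is changed, so the backup annotations stay untouched and the result can be compared against them.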
@abojchevski @carstenuhlig corrected on tagtog and in the html in nala. Now it's your turn to decide how to add Ectelion's ann.json to the corpus and to bootstrapping 0. Let me know if you have questions.
@abojchevski @carstenuhlig did you guys include this in itr0 with the correct files?
I believe so. However, it was a long time ago and I can't be sure.
See #169
Indeed, the old doc didn't have `s4s1f1p1`, but its text was somehow included at the end of `s4s1p2`. When the doc is parsed again with the latest version of tagtog, `s4s1f1p1` is indeed created. It may have been an old bug in tagtog. The old doc contained in nala was parsed by `NcbiJournalArticleParser_v0_3`, whereas tagtog now uses `NcbiJournalArticleParser_v0_4` (that is, a newer version). The display bug results from the wrong text placement.

Besides, only `Ectelion` (Rustem) and `Shpend` had annotated the doc. Ectelion's annotations in that part refer to the correct `s4s1f1p1`, whereas Shpend also has annotations in that part, but they refer to the wrong `s4s1p2`. Sanjeev's is an empty ann.json that can be safely deleted.

Tasks:
[ ] Reupload Shpend's annotations, correcting the offsets --> Decided it's not worth the effort. See comment: Part_id bug with idp4 dataset from tagtog #113 (comment)
[ ] nala team decides whether to replace the html in bootstrap 0 and whether and how to include the ann.json (after a possible merge)

Original description:
For the document
aYwkx1JUQj1EJKF0BpuDGtqNMqnK-PMC3613162
s4s1f1p1
annotatable
Additionally, when you open it on tagtog the following weird thing is going on: