Update README.md #723
@matyaskopp @osenova @maciej-ogrodniczuk: PLEASE UPDATE THE BRANCH IF CONSIDERED APPROPRIATE. Update ParlaMint-ES.v-3.0 readme.md |
@calzada I quickly went through the README. It can be completed, and the obvious untruths can be fixed too. The README is inserted in the same way as the data, so please follow the contributing guidelines in CONTRIBUTING.md.
Untruths:
Possible extension:
If you want to include original ECPC XML it would probably be better to describe the conversion to this format first and next to describe the conversion to ParlaMint TEI:
|
Dear Matyáš,
I answer below.
On Tuesday, 8 August 2023, Matyáš Kopp ***@***.***> wrote:
… @calzada <https://github.com/calzada> I quickly went through the README,
which can be completed, and the obvious untruth can be fixed too. README
inserting is done in the same way as the data, so please follow the
contributing file guidelines CONTRIBUTING.md
<https://github.com/clarin-eric/ParlaMint/blob/main/CONTRIBUTING.md>
Untruths: NOT UNTRUTHS ;-)
- stanza was not used for annotations - Stanza was not used for
annotations in ParlaMint-ES.v3.0, but it was used for ParlaMint-ES-2.1. At
any rate, I was waiting to check when you finished the annotation. So I will
now just say that it was annotated with UDPipe for ParlaMint.es-v.3.0.
- corpus specific metadata section is misleading
<https://github.com/calzada/ParlaMint/tree/calzada-patch-1-1/Data/ParlaMint-ES#corpus-specific-metadata>
- it seems that it does not describe ParlaMint-ES corpus -- I WILL
CHECK THIS AND REDO IT.
- parties were in fact re-roled to parliamentary groups, so ES
corpus does not encode both (#696 (comment)). I WILL INCLUDE THIS.
- the conversion to TEI was done at the end of your pipeline, I am not
aware of any other quality control. YES, MONICA REVISED OUR XML FILES TO MAKE
SURE CERTAIN MISTAKES WERE ERADICATED. IN FACT, MONICA USED CHATGPT TO THAT
EFFECT.
- I was engaged only in the last step (government members gathering,
conversion to TEI, and linguistic annotations), nothing else. WELL, YOU DID A
LOT OF WORK.
Possible extension:
- government members' acquisition: WHAT DO YOU MEAN BY THIS?
- you can explain why the chairman's speeches are not affiliated with
the exact person: OK I WILL EXPLAIN THIS.
If you want to include *original ECPC XML* it would probably be better to
describe the conversion to this format first and next to describe the
conversion to ParlaMint TEI:
- a process of conversion
- what has not been encoded (change party to group, constituencies of
all MPs). WELL, IT IS A SHAME BECAUSE WE DID HAVE ALL THIS INFORMATION, BUT
WE USED TOMAŽ ERJAVEC'S CONVERSION SINCE TIME WAS TIGHT. THE NEXT VERSION WILL
INCLUDE THIS EASILY.
- what has been newly added (government members, linguistic annotations,
coalition/opposition, ...): GOVERNMENT MEMBERS WERE ALREADY ADDED IN
ParlaMint-ES-2.1 // linguistic annotations were added in
ParlaMint-ES-2.1 // COALITION/OPPOSITION was there in ParlaMint-ES-2.1. SO
THESE ARE NOT NEW INFORMATION ITEMS???
|
Final question. I do not have GitHub Desktop here, which is why I am having
problems updating the documentation. I cannot work the way I normally do
since I only have a tablet here, so I have to update the document online. But then
I am forced to open a pull request. Is this all right? Does anyone check my
request?
Best for now,
mc
On Tuesday, 8 August 2023, María Calzada Pérez ***@***.***> wrote:
… Dear Matyáš,
I answer below.
|
Not only UDPipe, but also NameTag. See:
    <appInfo>
      <application ident="UDPipe" version="2">
        <label>UDPipe 2 (spanish-ancora-ud-2.10-220711 model)</label>
        <desc xml:lang="en">POS tagging, lemmatization and dependency parsing done with UDPipe 2 (<ref target="http://ufal.mff.cuni.cz/udpipe/2">http://ufal.mff.cuni.cz/udpipe/2</ref>) with spanish-ancora-ud-2.10-220711 model</desc>
      </application>
      <application ident="NameTag" version="2">
        <label>NameTag 2 (spanish-conll-200831 model)</label>
        <desc>Name entity recognition done with NameTag 2 (<ref target="http://ufal.mff.cuni.cz/nametag/2">http://ufal.mff.cuni.cz/nametag/2</ref>) with spanish-conll-200831 model.</desc>
      </application>
    </appInfo>
And you can also insert LINDAT acknowledgements to fulfil the terms of use of the LINDAT tools:
That sounds interesting. It can be mentioned in the documentation, ideally supported by an example where ChatGPT helped, and it should highlight that a human, not the AI, had the final word, so you did not introduce more noise into the data.
That the complete information about the members of the government is not present in CD format:
I used wget to download the wiki pages and a script to extract the information from the HTML table into TEI: gov-wiki2tei.pl
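The table-to-TEI step mentioned above could look roughly like the following Python sketch. It is not gov-wiki2tei.pl (a Perl script whose contents are not shown in this thread). The column names (`Name`, `Office`, `From`, `To`), the sample row, and the output element layout are all hypothetical, and the real ParlaMint encoding of government members may differ.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a downloaded wiki page; the columns are assumptions.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Office</th><th>From</th><th>To</th></tr>
  <tr><td><a href="#">Example  Person</a></td><td>Minister of Examples</td>
      <td>2020-01-13</td><td>2021-07-12</td></tr>
</table>
"""


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join fragments and normalise whitespace inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_tei(html):
    """Turn the first row into a header and each later row into a <person>."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    list_person = ET.Element("listPerson")
    for row in body:
        record = dict(zip(header, row))
        person = ET.SubElement(list_person, "person")
        ET.SubElement(person, "persName").text = record["Name"]
        # Assumed layout: one affiliation per table row, dated by From/To.
        affiliation = ET.SubElement(
            person,
            "affiliation",
            {"role": "minister", "from": record["From"], "to": record["To"]},
        )
        ET.SubElement(affiliation, "roleName").text = record["Office"]
    return list_person


if __name__ == "__main__":
    print(ET.tostring(table_to_tei(SAMPLE_HTML), encoding="unicode"))
```

Using only the standard library keeps the sketch self-contained. A real wiki table would likely need extra handling for merged cells, footnote markers, and date formats.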
I am not sure it is that easy, because you would also need to include the relationship between the party and the parliamentary group. The parliamentary groups also need a full definition (not only an abbreviation).
Sorry, I meant the difference between the original ECPC XML and ParlaMint-ES.
Update the README at the place where you did it now. I will insert it together with this pull request: #692 |
Final update of REadme.md
FInal + 1 update
Final and definite version of update (10-08-23)
Updating readme again. 10-08-23
Yet another refinement. 10-08-23.
I am closing this pull request; I think it is no longer relevant. |