[ZA] Restore plenary and committee appearances for MPs #2458

Closed
crowbot opened this issue Aug 13, 2018 · 7 comments

Comments

@crowbot
Member

crowbot commented Aug 13, 2018

No description provided.

@crowbot crowbot changed the title from "Restore plenary and committee appearances for MPs" to "[ZA] Restore plenary and committee appearances for MPs" on Aug 13, 2018
@crowbot
Member Author

crowbot commented Aug 24, 2018

Committee appearances restored; plenary deprioritised.

@chrismytton
Member

We had another request for the plenary appearances from Megan yesterday:

Plenary Appearances are not updating on the MPs' profiles. Parliament was very behind for years but a year ago it solved this. However, Plenary Appearances are still stuck on 2015. PA should be looking at either Parliament Hansard or PMG Hansard or PA Hansard but it is not working anymore. [Note: PA Hansard was removed from the menu bar last year but that does not mean that it is not still working].

See a couple of examples:

https://www.pa.org.za/person/cornelius-petrus-mulder/#appearances
https://www.pa.org.za/person/ahmed-munzoor-shaik-emam/#appearances

@chrismytton
Member

@jacksonj04 I've assigned this ticket to myself for now, hope you don't mind! I think it's going to be closely related to the work I'm doing on #2554.

@chrismytton
Member

Parliament’s website has been redesigned and the proceedings are now published as PDFs, but our scraper only handles Microsoft Word documents, since that’s what they used to be published as.

Rewriting the scraper and testing it against the new Hansard PDFs will take more time than we have remaining in the current grant period.

Perhaps a solution in the meantime would be to link to the PDFs directly from the Hansard pages on pa.org.za? That way at least people can still read the transcripts.

@chrismytton
Member

The new plan is to switch za-hansard to get the Word documents from the PMG website. These documents should be in the same format as the ones we were previously getting from parliament.

I've asked PMG for an example of the Word docs, and for details on where we'll be able to get a list of these docs.

@chrismytton chrismytton mentioned this issue Aug 23, 2019
@chrismytton
Member

I've switched the za_hansard app (which is now in this repo, rather than a separate one) to get the Word docs from PMG's site, see #2672.

We're not quite done, though: the files that PMG have been producing are in the newer ".docx" format, but the Hansard parser expects ".doc" files. I've asked PMG to produce files in the older Word format and upload them to their site, at which point they'll appear in the API. This issue is on hold until we've tested that the changes work with the new documents that PMG produce.
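
For illustration only, here is a minimal sketch of how a fetcher could tell the two Word formats (and the PDFs Parliament now publishes) apart before handing a file to the parser. It is not taken from za_hansard: the function name `sniff_document_format` and its return labels are hypothetical. It relies only on the well-known file signatures: legacy ".doc" files are OLE2 compound files, ".docx" files are ZIP archives, and PDFs start with `%PDF-`.

```python
# Hypothetical helper, not part of za_hansard: classify a downloaded
# document by its leading "magic" bytes rather than by file extension.

OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file (legacy .doc)
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP local file header (.docx)
PDF_MAGIC = b"%PDF-"                             # PDF header


def sniff_document_format(data: bytes) -> str:
    """Return 'doc', 'docx', 'pdf' or 'unknown' for the given file bytes.

    The signatures identify the container only: other OLE2 files
    (e.g. .xls) and other ZIP archives would also match, so treat the
    result as a hint rather than proof of the format.
    """
    if data.startswith(OLE2_MAGIC):
        return "doc"
    if data.startswith(ZIP_MAGIC):
        return "docx"
    if data.startswith(PDF_MAGIC):
        return "pdf"
    return "unknown"
```

A check like this could be used to skip or log ".docx" and PDF files instead of passing them to a parser that only understands ".doc".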

@chrismytton
Member

PMG are now uploading old-style Word documents to their API, which means the Hansard transcripts are appearing on the site once again.
