slash vs dash (IATI vs PublicBodies); "aggregator conflict" #152
Comments
@VladimirAlexiev The org-id.guide methodology includes the idea of a 'primary' register and always prefers these over secondary aggregations. XI-OC is unlikely to ever be created as a register/organisation identifier list, as it simply republishes information from existing registers. So in every case where there is an identifier in Open Corporates, it should be possible to cross-reference it to an official register and publish it as such. The reason for using codes (GB-COH) rather than URIs is so that users can choose which endpoint to resolve an identifier against, and to be robust against changes in URI patterns. (E.g., faced with GB-COH-07444723, with the meta-data available in org-id.guide the user could choose to resolve against Open Corporates data at https://opencorporates.com/companies/gb/07444723 or Companies House data at http://data.companieshouse.gov.uk/doc/company/07444723) |
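To illustrate the point about choosing an endpoint, here is a minimal Python sketch of resolving a code-based identifier against either source. The `URL_TEMPLATES` table, the resolver names, and the function names are assumptions made up for this example; they are not part of the org-id.guide schema. Only the two URL patterns are taken from the example above.

```python
# Hypothetical lookup table: register prefix -> resolver name -> URL template.
# The two GB-COH patterns mirror the example URLs in this comment; the
# structure itself is an assumption, not the org-id.guide meta-data format.
URL_TEMPLATES = {
    "GB-COH": {
        "opencorporates": "https://opencorporates.com/companies/gb/{id}",
        "companies_house": "http://data.companieshouse.gov.uk/doc/company/{id}",
    },
}


def split_org_id(org_id):
    """Split 'GB-COH-07444723' into ('GB-COH', '07444723')."""
    parts = org_id.split("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a three-part identifier: {org_id!r}")
    country, register, local_id = parts
    return f"{country}-{register}", local_id


def resolve(org_id, resolver):
    """Build a URL for org_id using the endpoint the caller prefers."""
    prefix, local_id = split_org_id(org_id)
    try:
        template = URL_TEMPLATES[prefix][resolver]
    except KeyError:
        raise LookupError(f"no {resolver!r} template for {prefix}") from None
    return template.format(id=local_id)


print(resolve("GB-COH-07444723", "opencorporates"))
print(resolve("GB-COH-07444723", "companies_house"))
```

If a URI pattern changes, only the template table needs updating; the published identifiers stay the same.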
Ok, got it. But:
Given that linked data / semantic web is now the predominant way of doing inter-enterprise data integration, wouldn't it be nice for IATI to think of permanent URLs? |
Happy to look at getting this included. If you can suggest the best way to capture this, we could add it to the schema for org-id.guide meta-data.
@VladimirAlexiev I'm not sure that is true. Whilst getting the second part of a URI standardised well may be possible - experience suggests that getting agreement on using the same domain - or maintenance of dereferenceable URIs at a particular location - is far from easy, and tends to undermine attempts to use URIs for data integration across distributed publication. |
Wikidata has 3 such props:
That seems like a bad excuse not to try it. There are many successful examples where this has happened, eg
If you have several URLs for an entity, |
And when the particular national register is not online, still use the OC site in formatterURL? That's a good idea. |
OpenCorporates only publishes records that are already online in one format or another.
Regards
Ben
…On Mon, May 22, 2017 at 10:29 AM, Vladimir Alexiev ***@***.*** > wrote:
identifier in Open Corporates, it should be possible to cross-reference it
to an official register, and publish it as such.
And when the particular national register is not online, still use the OC
site in formatterURL? That's a good idea.
|
@IATI/bas Is this resolved by the org-id changes? |
That is not entirely true. Many registers publish data in weird non-user-friendly and non-web-friendly ways, while @openc makes that data uniformly available. Eg the BG register hides companies behind MS.NET postbacks and CAPTCHA. Also, there aren't company page URLs including the official ID. There are pages keyed by an ugly GUID (eg https://public.brra.bg/CheckUps/Verifications/ActiveCondition.ra?guid=617f4edf8c154f4296efdf146513de21 for EIK=204060254) and even these are behind CAPTCHA. @openc doesn't yet have the BG register online but hopefully will soon, as part of @euBusinessGraph. The full data is dumped at http://opendata.government.bg/dataset/tbprobckn-pernctbp, we've analyzed it and a simple version is at http://data.businessgraph.io, eg see http://data.businessgraph.io/resource?uri=http://data.businessgraph.io/company/BG/200356710 |
IATI follows the guidance provided by org-id.guide. I am closing the issue on the IATI GitHub. The org-id GitHub is here: https://github.com/org-id/register/issues |
The "slash vs underscore" issue (split off from datasets/publicbodies#74) reflects a big difference in philosophy, so I believe it merits a new issue to be opened.
If we register OpenCorporates (who have info on 127M companies) as XI-OC in IATI, we'll have a similar issue:
Since there is no RO prefix in IATI (and maybe the official RO registry is not yet openly available), XI-OC-ro would be a very useful prefix.
And this raises a bigger issue (@CountCulture), consider
http://data.companieshouse.gov.uk/doc/company/07444723, which is the same entity as
https://beta.companieshouse.gov.uk/company/07444723 and also
https://opencorporates.com/companies/gb/07444723.
The GB official register is online and registered in IATI.
So one should prefer GB-COH-07444723 to XI-OC-gb-07444723.
But does this mean when an official register becomes available, we should deprecate "aggregator" identifier schemes or URLs (like OpenCorporates or PublicBodies) in favor of that official register?
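One way to picture the preference stated above is a small rewrite rule, sketched here in Python. The `XI-OC-<jurisdiction>-<id>` shape follows the hypothetical prefix discussed in this issue, and the `PRIMARY_REGISTERS` table is an assumption for illustration; it lists only the GB case mentioned here.

```python
# Assumed mapping from an aggregator jurisdiction code to the prefix of a
# primary register that is already online and registered. Only the GB entry
# comes from this issue; any other entry would be a further assumption.
PRIMARY_REGISTERS = {"gb": "GB-COH"}

AGGREGATOR_PREFIX = "XI-OC-"  # hypothetical OpenCorporates prefix


def prefer_primary(identifier):
    """Rewrite an aggregator identifier to its primary-register form.

    Identifiers with no known primary register are returned unchanged,
    which keeps prefixes such as XI-OC-ro usable in the meantime.
    """
    if not identifier.startswith(AGGREGATOR_PREFIX):
        return identifier
    rest = identifier[len(AGGREGATOR_PREFIX):]
    jurisdiction, sep, local_id = rest.partition("-")
    primary = PRIMARY_REGISTERS.get(jurisdiction.lower())
    if not sep or not local_id or primary is None:
        return identifier
    return f"{primary}-{local_id}"


print(prefer_primary("XI-OC-gb-07444723"))
print(prefer_primary("XI-OC-ro-12345"))
```

Under this sketch, deprecating an aggregator prefix for a jurisdiction amounts to adding one entry to the table once its official register is registered; whether that is the right policy is the open question here.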