This repository has been archived by the owner on Dec 29, 2021. It is now read-only.

company_name search does not always return results #7

Open
jerico opened this issue Oct 17, 2015 · 10 comments

Comments

@jerico

jerico commented Oct 17, 2015

I'm listing companies using the company_name key. Some company names work, others don't.

If I try "company_name=OceanaGold (Philippines), Inc. (Contractor/Operator)", no results are returned. It should return the one contract associated with that company.

http://api.resourcecontracts.org/contracts/search?from=0&per_page=1000&group=metadata&country=ph&company_name=OceanaGold%20(Philippines),%20Inc.%20(Contractor%2FOperator)&
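To rule out client-side encoding as a factor, the same request can be built with every reserved character percent-encoded. This is a minimal sketch in Python, assuming the endpoint and parameters from the URL above; `build_search_url` is a hypothetical helper name, and nothing here is confirmed to change how the API treats the comma.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the URL in this report.
BASE_URL = "http://api.resourcecontracts.org/contracts/search"


def build_search_url(company_name, country="ph", per_page=1000):
    """Return the search URL with all query values percent-encoded."""
    params = {
        "from": 0,
        "per_page": per_page,
        "group": "metadata",
        "country": country,
        "company_name": company_name,
    }
    # quote_via=quote encodes spaces as %20 (not '+'); urlencode calls it
    # with safe='' so '(', ')', ',' and '/' are percent-encoded as well.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


url = build_search_url("OceanaGold (Philippines), Inc. (Contractor/Operator)")
print(url)
```

If the fully encoded request still returns nothing, the problem is more likely on the server side than in how the URL was typed.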

@anderspeders

@anjesh Can you follow up here?

@anjesh
Contributor

anjesh commented Oct 19, 2015

I think this has to do with the punctuation characters. We need to explore the Elasticsearch indexing features a bit. I removed the comma from one of the company names (http://contracts.ph-eiti.org/contract/101) and it now appears in the results: http://contracts.ph-eiti.org/search?q=&year=&resource=&company_name=Adnama%20Mining%20Resources%20Incorporated

@anderspeders

Also, would it make sense to push this repo to NRGI/RC as well?

@anjesh
Contributor

anjesh commented Oct 19, 2015

Yes, definitely. I plan to move this and two more repos (subsite and elasticsearch) to the NRGI GitHub.

@anderspeders

Great, thanks.


@anjesh
Contributor

anjesh commented Oct 20, 2015

The presence of a comma in the company name is preventing the contracts from appearing in the results. One thing we could do here is remove the comma from the company name; as you can see, the result appears once the comma is removed.

image

image

However, the system does not currently differentiate supporting contracts from main contracts (beyond recording the relationship), so the API returns all company names, including those that appear only in supporting contracts. I see that Jerico has hidden all the supporting contracts, perhaps with an "annex?" text-pattern search. If the company name in a supporting document differs from the one in the principal document, that name is still listed, but selecting it displays no results.

@charlesyoung

Yep, it is a comma issue; same with Forum Exploration, Incorporated.

There shouldn't be a comma in the name, so can we build some form of validation when capturing the company name in the admin module?

For now, I will update the contracts and annexes linked to a company that has a comma in its name, as I just did for Far Southeast Gold Resources Incorporated.
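The validation suggested above could look roughly like this. It is a minimal sketch in Python, assuming the only rule is "no commas in a company name"; `validate_company_name` and `strip_commas` are hypothetical helpers and not part of the admin module.

```python
def validate_company_name(name):
    """Return a list of problems found in a company name (empty if none)."""
    problems = []
    if not name.strip():
        problems.append("company name is empty")
    if "," in name:
        problems.append("company name must not contain a comma")
    return problems


def strip_commas(name):
    """Remove commas and collapse any repeated whitespace left behind."""
    return " ".join(name.replace(",", " ").split())


print(validate_company_name("Forum Exploration, Incorporated"))
print(strip_commas("Forum Exploration, Incorporated"))
```

Rejecting the name at entry time keeps the stored data clean, while `strip_commas` could serve as a one-off cleanup for names already captured. Hyphens are deliberately left untouched.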

@charlesyoung

Busy updating, found more issues.

Another issue is that the system crashes when the name includes a bracket (OceanaGold (Philippines), Incorporated - FTAA No. 001, 1994). I have asked Jerico to speak to Joy to update.

It also doesn't support hyphens (Rapu-Rapu Minerals, Incorporated - MPSA No. 163-2000-V, 2000), which is a problem because in this example the company name needs a hyphen.

@charlesyoung

Scrap the above; it was related to the site going down.

@charlesyoung

Interesting that the contract name isn't updated when I remove the comma.

image
