Bug fix: OMIMPS URI prefix #109

Merged
merged 1 commit into from
May 31, 2024

Conversation

joeflack4
Contributor

@joeflack4 joeflack4 commented Apr 29, 2024

Changes

  • Update: URI prefix for OMIMPS.

Additional info

This manifested in the prefix not getting collapsed in omim.ttl. I'm not sure if there were any other consequences aside from that on the omim repo side. But I would imagine that having the URL out of sync with what is in mondo-ingest is breaking something, though I'm surprised we would not have noticed yet.
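To illustrate why a mismatched prefix can stop URIs from collapsing, here is a minimal Python sketch. It is not code from the omim or mondo-ingest repositories: `contract` is a hypothetical helper, the phenotypic series number is only illustrative, and it assumes the data carries the non-www form of the URI while the prefix map still carries the www form.

```python
from typing import Dict


def contract(uri: str, prefix_map: Dict[str, str]) -> str:
    """Collapse a full URI into a CURIE using the longest matching URI prefix.

    Hypothetical helper for illustration; returns the URI unchanged if
    no prefix in the map matches.
    """
    best = None
    for prefix, base in prefix_map.items():
        if uri.startswith(base) and (best is None or len(base) > len(best[1])):
            best = (prefix, base)
    if best is None:
        return uri
    return f"{best[0]}:{uri[len(best[1]):]}"


# Prefix map before this PR (www form for OMIMPS).
old_map = {
    "OMIM": "https://omim.org/entry/",
    "OMIMPS": "https://www.omim.org/phenotypicSeries/PS",
}
# Prefix map after this PR (no www, as in mondo-ingest).
new_map = {
    "OMIM": "https://omim.org/entry/",
    "OMIMPS": "https://omim.org/phenotypicSeries/PS",
}

# Illustrative URI; assumed to use the non-www form.
uri = "https://omim.org/phenotypicSeries/PS100070"

print(contract(uri, old_map))  # prints the full URI, not collapsed
print(contract(uri, new_map))  # prints OMIMPS:100070
```

With the www form in the map, no prefix matches and the full URI is left as-is, which is one way the symptom described above could arise.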

Related:

@joeflack4 joeflack4 self-assigned this Apr 29, 2024
@joeflack4 joeflack4 added the bug label Apr 29, 2024
@@ -77,7 +77,7 @@
 # LIDIA seems retired. so these are not resovable # Also: http://www.vetsci.usyd.edu.au/lida/
 'LIDA': 'http://sydney.edu.au/vetscience/lida/dogs/search/disorder/' # Listing of Inherited Disorders in Animals (defunct?)
 'OMIM': 'https://omim.org/entry/' # Online Mendelian Inheritance in Man (human disease and variants)
-'OMIMPS': 'https://www.omim.org/phenotypicSeries/PS' # Online Mendelian Inheritance in Man (phenotypes)
+'OMIMPS': 'https://omim.org/phenotypicSeries/PS' # Online Mendelian Inheritance in Man (phenotypes)
Contributor Author

@joeflack4 joeflack4 Apr 29, 2024

The change

Updated the OMIMPS URI prefix to comply with what is in mondo-ingest.

Background

OK, so I was surprised to see this. This change had already been made in #107 (comment) on Feb 3rd, which I noticed and commented on.

But Nico undid those changes in a commit to main on Feb 7th.

Then, on Feb 12th, Sabrina noticed an issue: monarch-initiative/mondo-ingest#438.

I didn't get to work on fixing this until Mar 29th, in monarch-initiative/mondo-ingest#478. By that time, I'd forgotten that Nico had undone the change on Feb 7th.

What I don't understand, though, is how that PR was able to fix Sabrina's issue, because it removed www from the URI in mondo-ingest while www remained in the omim repository.
I thought that maybe there was an OMIM release between Feb 3rd and Feb 7th, but I checked and there is none; there is only one release, on the 7th, with Nico's commit message showing he'd added www back. So I have no idea how Sabrina's issue manifested, unless there was a release between those dates that was later deleted. I also don't know how my pull request was able to fix the issue; if anything, it should have broken things.

QC

I think that some QC is needed here before merging. How does this plan sound?

  1. I run an omim release using this branch.
  2. I run a build of mondo-ingest from this branch where the URI prefix also needed to be updated: Bug fix: OMIMPS URI prefix mondo-ingest#504. Or I could run sh run.sh make slurp-omim. It will use that release.
  3. I check the slurp/omim.tsv and make sure that everything looks OK.
  4. Merge this PR and Bug fix: OMIMPS URI prefix mondo-ingest#504

Member

I can't be of help here, other than to say that I undid that change in February for fear of unknown downstream repercussions, which I myself did not want to solve :P Sorry for this unhelpful comment.

Contributor

@joeflack4 with the change you propose here for the OMIMPS URI, did you run through the QC steps you listed?

Contributor Author

That would be steps (1) through (3) above. Not yet. I'll tick off the box and also comment here when I go through these steps, assuming everything passes.

Contributor

OK, I'd test this change to make sure it actually fixes the data that was previously problematic, given the churn in the history you detailed in the "Background".

Member

If you are both interested in a proper deep dive into the issue, I recommend reading this: biopragmatics/bioregistry#497

Just my two cents: I do not share the urgency that Chris and other community members feel about getting this fixed, but I guess it's your turn now to decide these kinds of things! Happy to also discuss it in our next meeting.

Contributor Author

@matentzn That is sorta unrelated to this thread, which is just about running QC on the change regarding www removal.
But oh yes, I will give that a proper read if / when I work on this. I created an issue for this for later:

Contributor Author

@joeflack4 joeflack4 May 13, 2024

I ran the mondo-ingest build, but it's erroring out with a lexmatch error. I created an issue for that:

However, I did take a look at the slurp/omim.tsv before and after the changes in this PR; that's my main barometer for gauging if this change is working fine, and so far it looks good to me. I'll likely check the lexmatch outputs as well. Lemme know if you guys have any other ideas for QC checking.

omim tsv comparison.zip

I also ran the lexmatch on this build clone using my debugger, which used an older version of oaklib that circumvented the error. Doing so allowed me to get these outputs, in which I see that OMIMPS appears:
lexmatch outputs.zip

Contributor Author

@joeflack4 joeflack4 May 15, 2024

Build success & my QC check passed

I did a build with no errors:

My QC check is simply seeing that OMIMPS entries still appear in slurp/omim.tsv and mondo-sources-all-lexical.sssom.tsv. The comment above includes these outputs and, though they were run and attached before #520 completed, I just looked at the output files in that PR and they also look good.
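A check like this could be scripted roughly as follows. This is only a sketch, not the QC that was actually run: `count_prefix_rows` is a hypothetical helper, the column names and IDs in the sample are made up, and because the real column layout of slurp/omim.tsv is not shown here, the function simply looks for the prefix in any cell.

```python
import csv
import io
from typing import Iterable


def count_prefix_rows(lines: Iterable[str], prefix: str = "OMIMPS:") -> int:
    """Count TSV data rows in which any cell contains the given CURIE prefix."""
    reader = csv.reader(lines, delimiter="\t")
    next(reader, None)  # skip the header row, if there is one
    return sum(1 for row in reader if any(prefix in cell for cell in row))


# Made-up sample standing in for a file such as slurp/omim.tsv.
sample = (
    "id\txref\n"
    "MONDO:0000001\tOMIMPS:100070\n"
    "MONDO:0000002\tOMIM:100100\n"
)

print(count_prefix_rows(io.StringIO(sample)))  # prints 1
```

For a real file, the same function could be passed a handle opened with `open(path, newline="")`, and the counts from the before and after builds compared; a count that drops to zero after the change would be a red flag.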

Given that, I'm now happy to bring closure to the following PRs:

… is good for standardization reasons, and necessary to comply w/ what the URI is set to in the rest of the mondo-ingest pipeline.
@joeflack4 joeflack4 changed the title from "Bugfix: OMIMPS URI prefix" to "Bug fix: OMIMPS URI prefix" Apr 29, 2024
Contributor

@twhetzel twhetzel left a comment

Thanks for testing this change. Given that everything sounds like it's working as it should, approving.

@joeflack4 joeflack4 merged commit 8299957 into main May 31, 2024
1 check passed
@joeflack4 joeflack4 deleted the omimps-prefix branch May 31, 2024 22:07