QC checks #385

Draft
wants to merge 1 commit into
base: main

Conversation

joeflack4
Contributor

@joeflack4 joeflack4 commented Dec 8, 2023

Updates

QC checks

  • Add: QC check for duplicate mappings
  • Add: QC check to detect source IDs that appear in SSSOM but not in Mondo itself.

Addresses

Related

@joeflack4
Contributor Author

@hrshdhgd @souzadevinicius We want to use a proper testing framework and setup, as in #319, but we'd like to discuss setting that up in a standard, approved way before merging it. In the meantime, we're getting the ball rolling with some QC checks that simply involve running scripts.

Contributor Author

@joeflack4 joeflack4 left a comment

@matentzn Mergeable unless you find any mistakes or have any suggestions.

tests/__init__.py (resolved)
tests/check_dupe_exact_mappings.py (outdated, resolved)
tests/check_sssom_in_sync.py (resolved)
src/ontology/mondo-ingest.Makefile (resolved)
Member

@matentzn matentzn left a comment

We need to discuss this in a call, as there are two architectural questions attached:

  1. All Mondo-related Python code should be in mondolib, so that it gets the same code quality practices and shared code can be leveraged.
  2. IMO the check you have here is Mondo-level, not Mondo-ingest level, so there is a question of where it should be deployed.

@@ -0,0 +1,81 @@
"""Report duplicate exact mappings in mondo.sssom.tsv"""
Contributor Author

Report duplicate exact mappings in mondo.sssom.tsv

One of the QC checks.
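
For illustration only, a check of this kind could be sketched as below. This is not the code in tests/check_dupe_exact_mappings.py. It assumes that a "duplicate" means one `object_id` linked via `skos:exactMatch` to more than one `subject_id`, that the file uses the standard SSSOM column names, and that any metadata header lines start with `#`. The function name and the sample IDs are hypothetical.

```python
import csv
from collections import defaultdict

EXACT_MATCH = "skos:exactMatch"


def find_duplicate_exact_mappings(sssom_lines):
    """Return {object_id: sorted subject_ids} for objects exactly matched to >1 subject."""
    # Skip the '#'-prefixed metadata block that may precede the TSV header row.
    rows = csv.DictReader(
        (line for line in sssom_lines if not line.startswith("#")),
        delimiter="\t",
    )
    subjects_by_object = defaultdict(set)
    for row in rows:
        if row["predicate_id"] == EXACT_MATCH:
            subjects_by_object[row["object_id"]].add(row["subject_id"])
    # Keep only objects that are claimed by more than one subject.
    return {
        obj: sorted(subjects)
        for obj, subjects in subjects_by_object.items()
        if len(subjects) > 1
    }


# Made-up rows, for demonstration only (not real Mondo mappings).
SAMPLE = (
    "#mapping_set_id: example\n"
    "subject_id\tpredicate_id\tobject_id\n"
    "MONDO:9000001\tskos:exactMatch\tEX:1\n"
    "MONDO:9000002\tskos:exactMatch\tEX:1\n"
    "MONDO:9000003\tskos:exactMatch\tEX:2\n"
    "MONDO:9000003\tskos:broadMatch\tEX:3\n"
)

duplicates = find_duplicate_exact_mappings(SAMPLE.splitlines())
for object_id, subject_ids in duplicates.items():
    print(f"{object_id} is an exact match of: {', '.join(subject_ids)}")
```

A script like this could exit non-zero when `duplicates` is non-empty, so that a Makefile goal fails on the first detected problem.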

@@ -0,0 +1,90 @@
"""Check that SSSOM is in sync with Mondo by ensuring that SSSOM doesn't contain any novel IDs."""
Contributor Author

Check SSSOM and ontology in sync

One of the QC checks.
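
As a rough sketch of the idea, and not the code in tests/check_sssom_in_sync.py, the core of such a check can be expressed as a set difference. It assumes the two ID collections have already been extracted (for example, one from mondo.sssom.tsv and one from the ontology); the function name and the sample IDs are hypothetical.

```python
from collections import defaultdict


def find_novel_ids(sssom_ids, mondo_ids):
    """Group IDs that appear in SSSOM but not in Mondo by their CURIE prefix."""
    novel = set(sssom_ids) - set(mondo_ids)
    by_prefix = defaultdict(list)
    for curie in sorted(novel):
        # A CURIE looks like "PREFIX:local_id"; split on the first colon only.
        prefix = curie.split(":", 1)[0]
        by_prefix[prefix].append(curie)
    return dict(by_prefix)


# Made-up IDs, for demonstration only.
novel_by_source = find_novel_ids(
    sssom_ids=["EX:1", "EX:2", "OTHER:9"],
    mondo_ids={"EX:1"},
)
for source, ids in novel_by_source.items():
    print(f"{source}: {len(ids)} ID(s) in SSSOM but not in Mondo: {ids}")
```

Grouping by prefix keeps the report readable per source, which matches the intent of checking "any IDs for any sources".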

Successfully merging this pull request may close these issues.

QC check for: Set of Mondo terms in mondo.owl & mondo.sssom.tsv differ