add swisslipds import #242
@udp - tagging you here until you are properly added to the repo.
Some background:

Motivating request: EBISPOT/efo#1882

There is a strong desire to capture trait associations for very specific metabolites, as this data could potentially be mined for biomarkers (we really need some concrete examples to flesh this case out for the GWAS catalog). GWAS curator time is limited, so if they can simply provide lists of IDs to us and we have the tools to rapidly generate usable terms, that keeps everyone happy.

The current situation is arguably less efficient. Tables like this are curated by hand by GWAS editors, with mappings to parent classes (e.g. ceramide measurement) being done lexically by curators:

If we decided not to create such detailed terms in OBA, there would still be a need to map up. It is worth asking whether lexical mapping by curators is the most efficient and accurate way to achieve this. If, in future, we move composition to the database side and work with GWAS on an exchange format, then work on standards for which IDs are acceptable, on mappings, and on semantics will still be useful.
We seem to have already found lipids that aren't in swisslipids:
The swisslipids ontology is available from their SPARQL endpoint:
curl -L -H 'accept:text/turtle' 'https://beta.sparql.swisslipids.org/sparql/' \
  --data 'query=PREFIX+foaf%3a+%3chttp%3a%2f%2fxmlns.com%2ffoaf%2f0.1%2f%3e%0d%0aCONSTRUCT+%7b%0d%0a++%3fs+%3fp+%3fo+.%0d%0a%7d+WHERE+%7b%0d%0a++GRAPH+%3chttps%3a%2f%2fsparql.swisslipids.org%2fswisslipids%3e%7b%0d%0a++++%3fs+%3fp+%3fo+.%0d%0a++++FILTER(!sameTerm(%3fp%2c+foaf%3adepiction))%0d%0a%09%7d%0d%0a%7d' \
  -o swisslipids.ttl
(they may provide a more stable endpoint in future)
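For readability, the URL-encoded query in the command above decodes to a plain CONSTRUCT over the swisslipids graph that skips `foaf:depiction` triples. A minimal sketch of the same download that lets curl do the encoding is shown here; the function names `swisslipids_query` and `fetch_swisslipids` are made up for illustration, and the endpoint is the beta one quoted above, which may change.

```shell
# Print the decoded query (same content as the pre-encoded string in the
# curl command; only insignificant whitespace differs).
swisslipids_query() {
  cat <<'EOF'
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
CONSTRUCT {
  ?s ?p ?o .
} WHERE {
  GRAPH <https://sparql.swisslipids.org/swisslipids> {
    ?s ?p ?o .
    FILTER(!sameTerm(?p, foaf:depiction))
  }
}
EOF
}

# Download the graph as Turtle; curl URL-encodes the query text itself.
# $1 is the output file (defaults to swisslipids.ttl).
fetch_swisslipids() {
  curl -L -H 'accept:text/turtle' 'https://beta.sparql.swisslipids.org/sparql/' \
    --data-urlencode "query=$(swisslipids_query)" \
    -o "${1:-swisslipids.ttl}"
}
```

Calling `fetch_swisslipids` with no argument should write `swisslipids.ttl`, as in the original command. Keeping the query in readable form makes it easier to adjust the filter later.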
This can easily be converted into a module for imports using robot extract. I successfully ran a test of this locally using the swisslipids terms from https://docs.google.com/spreadsheets/d/1HWZfjHak388sXwjhueH3qzCzegkzdp-DLvY4mC9LEaw/edit#gid=0, which gave a 1.5MB module (from the initial 1.5GB ontology).
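The extraction step could look roughly like the sketch below. It is not the exact command used in the local test: the BOT method, the `extract_module` wrapper, and the file names in the usage note are assumptions. The term file is expected to hold one term IRI or CURIE per line, exported from the spreadsheet.

```shell
# ROBOT is overridable, in the style of ODK Makefiles
# (e.g. ROBOT="java -jar robot.jar"). Defaults to the robot wrapper script.
ROBOT="${ROBOT:-robot}"

# extract_module SOURCE TERM_FILE OUTPUT
# Runs robot extract on SOURCE, seeded with the terms listed in TERM_FILE.
# Assumption: the BOT method is used; the issue does not say which one was.
extract_module() {
  source_file="$1"
  term_file="$2"
  output="$3"
  $ROBOT extract --method BOT \
    --input "$source_file" \
    --term-file "$term_file" \
    --output "$output"
}
```

Usage, with hypothetical file names: `extract_module swisslipids.ttl swisslipids_terms.txt swisslipids_module.owl`. With a 1.5GB source file the Java heap will probably need raising, for example via the `ROBOT_JAVA_ARGS` environment variable read by the `robot` wrapper script.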
To generate a usable import for OBA, we would need an additional step on top of the usual module extraction (this should be added to the OBA extension makefile). The source file and the extracted module both contain CHEBI terms with asserted equivalence to swisslipids terms. Terms that are only in swisslipids nest under these. Our aim should be a module that contains only the swisslipids hierarchy, with subClassOf relationships to CHEBI terms (as bare IRIs). This should be achievable with some ROBOT filter.