Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve SBO term handling #469

Open
2 tasks
ChristianLieven opened this issue Jul 12, 2018 · 10 comments
Open
2 tasks

Improve SBO term handling #469

ChristianLieven opened this issue Jul 12, 2018 · 10 comments
Assignees

Comments

@ChristianLieven
Copy link
Contributor

ChristianLieven commented Jul 12, 2018

Problem description

We're really specifically checking for a few node SBO terms and not yet for the presence of any children of those nodes. Ideally, we'd have some internal representation of the ontology such that we can check if the the SBO term of an object matches a specific node term or any of its children.

We should also remove the duplicate check for SBO term annotations in our annotations tests as per this issue #447

  • Refactor tests and helper functions such they make use of the full SBO Ontology
  • Explicitly exclude SBO terms from tests using the annotation attribute.
@Midnighter
Copy link
Member

Very much agreed. I'd like to see the SBO handled in cobrapy, though.

@gregmedlock
Copy link
Member

To get the discussion going about what this will look like for cobrapy and memote, how do you envision cobrapy users interacting with SBO terms? There are many cases that automatic assignment makes sense, e.g. all metabolites should get SBO:0000247 (simple chemical entity), but these still have exceptions that make automation dangerous (e.g. polymers shouldn't have SBO:0000247).

Would a set of SBO helper functions be appropriate, such as functions that flag model elements as candidates for SBO terms and then let the user add the terms manually? Or is this getting into the realm of a package dedicated to annotating cobrapy models with SBO terms?

On the memote side, it seems like cobrapy-based "checks" could clean things up, e.g. calling cobra.annotation.sbo.check_element() to see whether an SBO term is being used properly, to the extent that it can be automatically assessed. If the entirety of SBO was integrated within cobrapy, this could also be done at the time of assignment (e.g. raise a warning if a non-transport reaction is assigned to a transport-related SBO term). But, the checks within memote would still be necessary because other packages probably wouldn't implement the practice.

@Midnighter
Copy link
Member

Midnighter commented Jan 24, 2019

I like where you're going in terms of functionality.

Making it easy for a model reconstruction pipeline to mostly do the right thing is definitely a worthy goal.

Would a set of SBO helper functions be appropriate, such as functions that flag model elements as candidates for SBO terms and then let the user add the terms manually? Or is this getting into the realm of a package dedicated to annotating cobrapy models with SBO terms?

In terms of implementation, the first step in my opinion is to find a good package to work with ontologies. There are a couple of options likely candidates are ontobio, owlready, ontospy. Then one could provide a couple of convenience functions around the ontology package. We had done some preliminary tests but discarded using any of them because none of them supports Python 2 & 3. I would start with SBO annotation being part of the main cobrapy package but that depends a bit on the desired feature set.

The last cobrapy user survey encouraged me to drop support for Python 2 which would then immediately make any of those ontology packages targets to build on. Overall, I'm afraid good SBO support is a few months down the line, though. There are a number of things to wrap up on the cobrapy side plus a big clean up of the codebase to make it Python 3 only.

Fully agree with your last paragraph. Overall it will really help if we can get as clear as possible a picture of the different use-cases involving the SBO. I wonder who else is really interested in this, it'd be great to nail down some requirements with input from different parties.

@famosab
Copy link

famosab commented Sep 15, 2021

It seems to me that my issue ( #721) corresponds to this discussion aswell. Would it make sense to use libSBML to access the SBO tree? If I recall correctly, there is at least a possibility with JSBML so it might be possible with libSBML aswell.

JSBML offers a function called getParentTerms() which allows to access SBO terms which are more general. So far, libSBML only seems to support checking for certain ''identities'' so whether a term is for example a Conservation Law.

@famosab
Copy link

famosab commented Sep 15, 2021

I commented the difference between JSBML and libSBML implementation of the SBO class on their GitHub: Implementation of SBO term trees Issue number 164 (#164). Maybe that helps.

@Midnighter
Copy link
Member

I commented the difference between JSBML and libSBML implementation of the SBO class on their GitHub: Implementation of SBO term trees Issue number 164. Maybe that helps.

Can you provide a link for that, please?

@matthiaskoenig
Copy link
Collaborator

Here the issue: sbmlteam/libsbml#164

@Midnighter
Copy link
Member

Sounds like storing the SBO in a file distributed with cobrapy and using a package to conveniently access the ontology is the way to go for now.

@draeger
Copy link

draeger commented Sep 16, 2021

When storing the SBO file within a distribution of COBRApy, it is crucial to make it exchangeable. The SBO is still growing. New subterms may be added, and all of a sudden, COBRApy may raise errors when users want to check a seemingly non-existing term. Some users might not be able to upgrade their version of COBRApy for various reasons. This will be a great advantage if they have a relatively easy way of simply exchanging the SBO file distributed with COBRApy. Also, if it isn't exchangeable, a new release of COBRApy could become necessary with each new version of SBO.
Nice could also be an automated download of the latest SBO file (OWL format) upon building COBRApy. We have implemented that in the build process of JSBML also. Hence, each new JSBML build will automatically include the latest SBO file at the time of building.
Please also note that there are already implementations of ontology parsers available in Python. Perhaps, it might not even be such a considerable effort to support the directed acyclic SBO graph queries.

@Midnighter
Copy link
Member

Midnighter commented Sep 16, 2021

When storing the SBO file within a distribution of COBRApy, it is crucial to make it exchangeable. The SBO is still growing. New subterms may be added, and all of a sudden, COBRApy may raise errors when users want to check a seemingly non-existing term. Some users might not be able to upgrade their version of COBRApy for various reasons. This will be a great advantage if they have a relatively easy way of simply exchanging the SBO file distributed with COBRApy. Also, if it isn't exchangeable, a new release of COBRApy could become necessary with each new version of SBO.

Those are great points. With eQuilibrator, we've put data files on Zenodo and then load/update by DOI. That's been a pretty decent experience so far. Added bonus is that one could say model was created with SBO version x and DOI. It'd actually be great if the SBO was released directly to Zenodo rather than us doing it.

Please also note that there are already implementations of ontology parsers available in Python. Perhaps, it might not even be such a considerable effort to support the directed acyclic SBO graph queries.

Agree completely. I noted a few options above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants