Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Information content of a custom GO annotation. #270

Open
ohdongha opened this issue May 2, 2023 · 1 comment
Open

Information content of a custom GO annotation. #270

ohdongha opened this issue May 2, 2023 · 1 comment

Comments

@ohdongha
Copy link

ohdongha commented May 2, 2023

Hello,
Thanks for developing and maintaining this amazing resource.

There is an instruction to obtain information content for each GO term in human, mouse, and fly GOA.

Could you also show how to calculate the information content for custom GOA in id2go format (i.e., gene ID in the 1st column and GO terms concatenated with ";" in the 2nd column) for a non-model species (i.e., not human, mouse, fly, etc.)?

I would like to compare custom GOAs created by different tools and see which had called more specific GO terms and the information contents of GO terms shared by both GOAs or unique to each GOA.

Thanks!
Dong-Ha

@ohdongha
Copy link
Author

ohdongha commented May 25, 2023

Could you also show how to calculate the information content for custom GOA in id2go format (i.e., gene ID in the 1st column and GO terms concatenated with ";" in the 2nd column) for a non-model species (i.e., not human, mouse, fly, etc.)?

I realize this could be achieved by adopting a code in a previous issue (#196):
#196 (comment)

I have a couple of questions below.
...

Q1. When I tried the code in #196, tests module was not recognized:

>>> from tests.utils import get_godag
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'tests'

Can I replace the following:

from tests.utils import get_godag
godag = get_godag('go-basic.obo')

... with this?

from goatools.obo_parser import GODag
godag = GODag("go-basic.obo")

Will these two do the same?

...

Q2. When I tried the codes posted in #196 (with the modification of the part to get godag), I received a warning at the step getting the tcntobj that says "GODAG is none":

>>> from goatools.anno.idtogos_reader import IdToGosReader
>>> from goatools.semantic import TermCounts
>>> from goatools.semantic import get_info_content
>>> from goatools.obo_parser import GODag
>>> godag = GODag("go-basic.obo")
go-basic.obo: fmt(1.2) rel(2023-03-06) 46,579 Terms
>>> annoobj = IdToGosReader("temp.id2go")
HMS:0:00:00.074716  21,312 annotations READ: temp.id2go 
>>> tcntobj = TermCounts(godag, annoobj.get_id2gos())
WARNING:root:IdToGosReader.get_id2gos: GODAG is None. IGNORING namespace(None). If you are running `map_to_slim.py`, this warning can be ignored.
8333 IDs in all associations
>>> go_id = 'GO:0004930'
>>> info_content = get_info_content(go_id, tcntobj)
>>> print('IC={IC:.05f} {GO} {NAME}'.format(GO=go_id, IC=info_content, NAME=godag[go_id].name))
IC=4.04149 GO:0004930 G protein-coupled receptor activity

What is the meaning of this warning, and can it be ignored?
Plus, I would appreciate it if you could confirm that the IC value printed here would be as the GOAtools functions intended (i.e., correct).

@dvklopfenstein, thank you in advance!

Best,
Dong-Ha

@ohdongha ohdongha changed the title Information content of a custom set of GO terms (or a custom GOA). Information content of a custom GO annotation. May 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant