How to get current release (41) data of FungiDB through AnnotationHub #5
Comments
Greetings! The EuPathDB AnnotationHub resources will be updated with each new Bioconductor release (which in turn follows the R release cycle), so the updated versions should be available sometime around April or May. If you want to access the resources sooner, you can also use this package to generate them locally by modifying the scripts to build resources only for your species of interest. Cheers, |
Hi, make_eupath_orgdb(species="substring_of_species_name", webservice="fungidb") Hopefully there are useful examples in the vignette. |
Thanks @khughitt and @abelew. Very useful. As @abelew suggested, I ran the command; the command and the log are attached below. However, it seems the download failed.
|
Hi, |
Ah it turns out that either the eupathdb folks changed how they return a gene with no orthologs, or I did not account for it properly. But either way it seems to be running happily now. I think it will finish generating the orgdb in the next 10-20 minutes. I will upload it to: |
Sorry I took so long; I uploaded a tarball of the orgdb package. On my machine at least, it installed via R CMD INSTALL without shenanigans. After installing it, I used load_eupath_annotations() and got back a data frame with 20,367 rows and 78 columns. Full details are in test_001_anidulans.R. |
I downloaded the tarball from the link you gave. But, I couldn't find the script |
Ah yeah, that is in my fork of the repository. I need to PR it. With respect to the set of all species, I have been meaning to do that in preparation for the next release but haven't. I will start that process now while I am thinking about it. But to answer your question, yes. The scripts/ directory in the package contains the scripts Keith and I wrote for that purpose. With that in mind, I am queuing up a generator for the rest of the fungidb now on our cluster. I think I will try parallelizing it and see if I make the eupathdb webservers sad. |
I have the set of all fungidb packages generating now. |
An update: |
The 140 species from FungiDB have finished. I am now generating the rest of the eupathdb (227 more to go!). The full set of packages is 14G. I am copying them now to: If you get the itch to grab them, please give me a heads up when you are finished so that I can clear it out and kill the web server. |
Hi @abelew, big thanks! I have one question. Can you tell me which FungiDB release you used to generate the data? The latest release is 41. Also, can I download the packages through AnnotationHub, or do I have to download each of them individually? |
I compared org.Anidulans.FGSC.A4.v42.eg.db against the v39 version obtained from AnnotationHub. v42 seems much more up to date. However, I have one concern that I hope you can address. v42 has several new columns in the OrgDb compared to v39. Some of them differ only in column name while the content is the same. You can see them in the table below. If the names were kept consistent, dependent packages would not break.
One more thing is the gene description column. In v39 it was given under the name
I tried to download the packages from the link you provided. 15 of them could not be downloaded.
|
Why is there no GRanges object at the link you provided? |
I will respond in reverse order because I get confused. With respect to the weird download errors, Apache's mod_negotiation was confused by those filenames and thought the '.var' in the filenames was telling it that they were another language. I just disabled mod_negotiation, and they should work fine now. Finally, your first queries: These are eupathdb release 42 packages. The column names are an interesting concern. I changed them because there are duplicate columns in different tables with the same name. In order to avoid the resulting collision, I prefixed the column names with their home table. Thus you will find columns with the following prefixes (and their source):
Finally, I think the eupathdb folks renamed the gene description column to 'gene_product', and as such you will find it under 'ANNOT_GENE_PRODUCT'; in addition, the 'ANNOT_PFAM_DESCRIPTION' column might be of interest. Oh, I skipped one other question: These packages are not yet available in AnnotationHub. I am hoping to learn how to upload them shortly from the AnnotationHub folks; though I have not yet finished generating all of the other eupathdb packages, so it will need to wait until those are complete (probably later today or tomorrow, last I looked it had finished generating 280 of 346). I hope this helps. |
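The table-prefix renaming described above can be illustrated with a minimal sketch. It is written in Python rather than the package's R, and it is not the package's actual code. It only shows how prefixing each column with its home table removes name collisions when tables are merged. The `go` table and every column name except `gene_product` are hypothetical; `ANNOT_GENE_PRODUCT` is the one resulting name that the thread confirms.

```python
from collections import Counter


def find_collisions(tables):
    """Return the column names that occur in more than one table."""
    counts = Counter(col for cols in tables.values() for col in cols)
    return sorted(name for name, n in counts.items() if n > 1)


def prefix_columns(tables):
    """Prefix every column with its home table name, upper-cased."""
    merged = []
    for table_name, cols in tables.items():
        for col in cols:
            merged.append(f"{table_name}_{col}".upper())
    return merged


# Hypothetical tables, for illustration only. Only 'gene_product' in the
# 'annot' table corresponds to a column named in the thread.
tables = {
    "annot": ["gene_product", "source_id"],
    "go": ["term_name", "source_id"],
}

collisions = find_collisions(tables)  # names shared across tables
columns = prefix_columns(tables)      # unique once prefixed

print(collisions)
print(columns)
```

The cost of this design is the one raised earlier in the thread: downstream code that selects columns by their old, unprefixed names has to be updated.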
The rda GRanges files are copied. I ended up regenerating a few that I accidentally deleted. |
Do you mind if I close the issue? I think that if you are ok with the changed column names, then everything is complete. |
Sure. Thanks a lot for your support. I really appreciate it. |
Hi, sorry for troubling you again. As you mentioned previously, the latest FungiDB data would be added to AnnotationHub with the new Bioconductor release. The new Bioconductor release (3.9) is out now, so I updated both Bioconductor and AnnotationHub. The FungiDB data available in the latest release is still v39. Another surprising thing is that there is no OrgDb object available in AnnotationHub now. As you can see below, rdataclass contains only GRanges and no other objects. Can you shed some light on this discrepancy?
Thanks a lot . |
Greetings, |
Hi, thanks for the prompt reply. In any case, I have the v42 data which you uploaded previously. I can wait a few days for it to become available through AnnotationHub, as that will help me maintain my downstream code. If needed, I will ask you. Thanks a lot, |
Hi, now I can see the OrgDb and GRanges objects from FungiDB (v42) in AnnotationHub, which is fantastic. However, I cannot download the OrgDb objects, although the GRanges objects download fine. I submitted an issue on AnnotationHub. I would appreciate it if you could look into it and resolve it. Many thanks |
Hi,
I am using AnnotationHub to access FungiDB data. All of the FungiDB data there (object classes OrgDb and GRanges)
is from FungiDB release 39. However, the current FungiDB release is 41. How can I use FungiDB release 41 data through AnnotationHub?
Thanks.