Collection of tools and databases required for GTN that cannot be synchronized easily between the main servers #5391
Comments
Another issue is tools that are configured differently in TPV. E.g. quast on .org cannot access the internet and therefore fails in this tutorial: https://training.galaxyproject.org/training-material/topics/microbiome/tutorials/metagenomics-assembly/tutorial.html
Thanks for filing this @paulzierep! Really appreciate it. cc @bgruening @natefoo @cat-bro since some databases etc. may be needed on each server.
I'm shocked that it needs internet access. That shouldn't be necessary :/ cc @jennaj
I'm looking through a couple of these to see if we can analyse this problem statically, but I fear we can't. E.g. for the blastn link, the tutorial mentions a database but the workflow does not! Instead it uses a connected input parameter that's empty. Same for Kraken in https://training.galaxyproject.org/training-material/topics/microbiome/tutorials/pathogen-detection-from-nanopore-foodborne-data/tutorial.html: they're empty input parameters. The test case for
In a subset of cases it is technically possible, but I am very afraid of false positives/negatives there, which means we might have to restrict it to tools we know use databases. Even then I'm not confident, since it requires such deep parsing of Galaxy's data structures and API responses, especially since there's no flag or signal (as far as I can tell) in the API responses that a specific parameter is a "database select" parameter that might vary between servers. If that were exposed, so that we had a convenient way to know which parameters are "database selects", this problem would look a lot more tractable (albeit still with the cases of "workflow doesn't match tutorial and doesn't pre-select a DB").
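To make the "restrict it to tools we know use databases" idea concrete, here is a minimal sketch, not something the GTN tooling actually does. It assumes an exported `.ga` workflow loaded as a dict, with a `steps` mapping whose entries carry `tool_id`, a JSON-encoded `tool_state` string, and `input_connections`. The hint list `DB_TOOL_HINTS` and the function name `steps_needing_db_check` are hypothetical, and matching on tool-name fragments is a crude stand-in for the missing "database select" flag.

```python
import json

# Hypothetical allow-list: tool-name fragments for tools this thread says
# depend on server-side databases. Not derived from any Galaxy API.
DB_TOOL_HINTS = ("kraken2", "blast", "humann")


def steps_needing_db_check(workflow: dict, hints=DB_TOOL_HINTS) -> list:
    """Return the tool steps whose tool_id matches a known database-using tool.

    Assumes the exported .ga layout: workflow["steps"] maps step ids to dicts
    with "tool_id", "tool_state" (a JSON string) and "input_connections".
    """
    flagged = []
    for step_id, step in workflow.get("steps", {}).items():
        tool_id = step.get("tool_id") or ""  # input steps have no tool_id
        if not any(hint in tool_id.lower() for hint in hints):
            continue
        raw_state = step.get("tool_state") or "{}"
        state = json.loads(raw_state) if isinstance(raw_state, str) else raw_state
        flagged.append({
            "step": step_id,
            "tool_id": tool_id,
            # Parameters fed by a connection carry no value in the workflow
            # itself, so a human has to check them against each server.
            "connected_inputs": sorted(step.get("input_connections", {})),
            "state_keys": sorted(state),
        })
    return flagged
```

This only narrows the list of steps a person has to look at; it cannot tell whether a given server actually offers the database the tutorial names, and it does nothing for the "workflow doesn't match tutorial" cases.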
This should already be tested. We test for tools used in the tutorial / workflow. If you notice any bugs here please let me know! :) See the
I installed the phyloseq IT on .org but don't have data to test it; can someone do that please? The humann databases are updated as per the linked issue. For the kraken2, blastn, and pathogen detection DBs, are there any specific details about what is needed?
humann nucleotide and protein database: Add humann nucleotide and protein database for GTA training usegalaxy-tools#855 (missing on AU/ORG)
kraken2: 5 tutorials, multiple DBs; all should be updated to the 2024 versions, both on the servers and in the tutorials
blastn: https://training.galaxyproject.org/topics/assembly/tutorials/assembly-decontamination/tutorial.html
DBs for https://training.galaxyproject.org/training-material/topics/microbiome/tutorials/pathogen-detection-from-nanopore-foodborne-data/tutorial.html
phyloseq IT (only in EU)
Will try to add to this list step by step. Still need to check the versions of the DBs on the servers, if they exist.