Multi-tool functionality and subworkflows as hub of methods #385
Comments
Regarding the "Toolsheet", how does that relate to what we proposed in #362? |
The toolsheet is to decide which DE and functional analysis methods to run. An example is here. This is the default toolsheet where each row is a combination of tools that would make sense to be together. The idea is that the user can select for example As for your question, the |
I'm wondering if it wouldn't be more convenient to specify everything in YAML format? Essentially each list item would replace one row in your toolsheet and everything could be specified in one place. YAML seems the more natural choice to me in cases where you would otherwise have a lot of empty columns in a CSV file and/or lists of things such as

I'm also afraid that all the parameters for a differentialabundance run get scattered across too many places... nextflow params, contrasts file, toolsheet file, samplesheet... I'd rather reduce the number of places where to specify parameters. Something like:

models:
  - method: limma
    formula: ~ treatment + response
    contrasts:
      - id: treatment_a_vs_b
        type: simple
        comparison: ["treatment", "A", "B"]
    enrichment:
      - gsea
      - gprofiler2
  - method: propd
    permutations: 100
    contrasts:
      - id: treatment
        type: anova
        column: treatment
  - compositional: propr
    metric: rho

This obviously needs to be fleshed out in more detail. For this it would be important to understand which of the workflows depend on each other. I guess the compositional workflow is completely separate from the differential workflow. The enrichment workflow could be independent when working on the expression data, but it could also work off a ranked gene list generated by the differential workflow. |
I don't have a strong preference between YAML and CSV format. However, merging the contrasts and the toolsheet into one file could become tricky. When there are many methods available, it is nice to have a 'default' toolsheet as a place to specify all the combinations of tools that really make sense together from a theoretical perspective. This file will always be there, in the pipeline's GitHub repository, whereas the contrast file is data specific. |
What are the implications of this? Would you fail the pipeline if a user specifies an "invalid" combination? |
Don't have a plan for that yet, but one option is to raise a warning that it is a non-tested combination. Indeed, for benchmark users, we considered the possibility of providing an extra toolsheet with all the rows one wants to benchmark. |
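The warning option mentioned above could look roughly like the following sketch. It is not taken from the pipeline: the set of tested combinations, the function name `check_combination`, and the pairing of methods are all hypothetical, chosen only to illustrate warning instead of failing.

```python
import warnings

# Hypothetical stand-in for the default toolsheet: each tuple is a
# (differential method, functional method) pair assumed to be tested together.
# None means that no functional analysis step is run.
TESTED_COMBINATIONS = {
    ("limma", "gsea"),
    ("limma", "gprofiler2"),
    ("propd", None),
}


def check_combination(diff_method, func_method=None):
    """Return True if the pair is in the tested set; otherwise warn and return False."""
    if (diff_method, func_method) in TESTED_COMBINATIONS:
        return True
    # Warn rather than raise, so the run can continue with an untested pair.
    warnings.warn(
        f"Untested combination: {diff_method} + {func_method}",
        UserWarning,
    )
    return False
```

With this design a benchmark user could still run any row they like; the warning only flags that the pair is not in the default list.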
This is also a concern for us... but for the moment we have not found a better solution. It would be nice to brainstorm at some point, and you are very welcome to contribute if you find a better way :) |
Just to clarify again, this will only be in the pipeline and the user specifies the combination of tools using standard params, e.g. |
We defined |
No, it's all good then. All I wanted to know is that in a standard pipeline run, the user wouldn't be required to specify yet another config file. As you said, we should still think about how to reduce the number of places where to specify parameters, but that's a topic for a separate issue. |
Here I created a meta issue with all the steps/sub-issues needed to achieve what we agreed to do. |
Since the tool sheet will be read with nf-schema, it can accept both CSV and YAML, so a user could use the one that is more convenient for them. |
Actually @mirpedrol, if it is in YAML format, does that mean it would be more flexible and better allow the definition of optional methods/params? |
I would say they are equivalent if we use simple YAML (without nesting), up to a user preference which one is easier to type. |
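To illustrate the equivalence claimed above, here is a minimal sketch that renders CSV rows as a flat YAML list, one list item per row. The column names `diff_method` and `func_method` and the rows are hypothetical, not the real toolsheet schema, and the hand-written emitter assumes plain values that need no YAML quoting.

```python
import csv
import io

# Hypothetical toolsheet: column names and rows are illustrative only.
TOOLSHEET_CSV = """diff_method,func_method
limma,gsea
propd,
"""


def csv_to_flat_yaml(text):
    """Render each CSV row as one flat YAML list item, skipping empty cells."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        # Keep only the cells that have a value; empty CSV columns vanish in YAML.
        filled = [(key, value) for key, value in row.items() if value]
        for i, (key, value) in enumerate(filled):
            prefix = "- " if i == 0 else "  "
            lines.append(f"{prefix}{key}: {value}")
    return "\n".join(lines)


print(csv_to_flat_yaml(TOOLSHEET_CSV))
```

The only visible difference in this flat form is that an empty CSV cell becomes an omitted key in YAML; without nesting, neither format expresses anything the other cannot.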
Goals
Context
Some effort was made in the dev-ratio branch to explore these options.
Now the plan is to break the work down into small pieces, clean up the code, and open PRs to dev.
Steps needed
TABULAR_TO_GSEA_CHIP to nf-core/modules modules#7200
Other related features