Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parameter to exclude unbinned data from post-binning steps #708

Merged
merged 8 commits into from
Oct 30, 2024

Conversation

dialvarezs
Copy link
Contributor

@dialvarezs dialvarezs commented Oct 28, 2024

This pr adds a new parameter exclude_unbins_from_postbinning, with default value false.
The reasoning behind this is that as stated in #649, Prokka can take a really long time with unbinned data, and also qc metrics and taxonomic classification normally are not meaningful for non-bins.
With the default set to false, this parameter doesn't introduce any breaking change to the current workflow, but adds flexibility to focus on binned data on post-binning steps.

This PR also fixes one minor issue I found in line 827 of mag.nf, where the channel passed to GUNC was ch_input_bins_for_qc instead of ch_input_bins_for_gunc (the same, but without eukarya domain).

Closes #649 .

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/mag branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great idea!

I've tweaked the phrasing of the parameter and given more guidance to the parameter name.

Please update the CHANGELOG and I think then we might be good to go.

The only thing check is that the bin_summary steps don't fail for some reason because some contigs are missing for some reason

nextflow.config Outdated Show resolved Hide resolved
nextflow_schema.json Outdated Show resolved Hide resolved
nextflow_schema.json Outdated Show resolved Hide resolved
workflows/mag.nf Outdated Show resolved Hide resolved
@dialvarezs
Copy link
Contributor Author

dialvarezs commented Oct 30, 2024

Thank you for the review! I believe everything should be good to go now.

As for a possible bin summary failure, I don’t think it will case issues, because DEPTHS and every other bin summary input uses the same channel. I did a test with a few samples and didn't have any problem.

On that note, I think it should be worth considering making bin summary more flexible, allowing it to handle cases where not all inputs have identical bins. So, if a QC bin fails, we could generate the table with a warning rather than an error. This should prevent issues like #603, but obviously, if a QC process fails the user should have to look into it. However, this could be a discussion to have in another PR.

Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much @dialvarezs !

nextflow_schema.json Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@jfy133 jfy133 merged commit cd5ebae into nf-core:dev Oct 30, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants