Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fqtk as a demultiplexer #99

Merged
merged 4 commits into from
Mar 16, 2023
Merged

Conversation

sam-white04
Copy link
Collaborator

@sam-white04 sam-white04 commented Mar 8, 2023

Hello,

I have added fqtk as an optional demultiplexer. The tool fqtk requires an additional input to be provided as a path in the 5th column of --input samplesheet.csv.

Thank you,
Samantha White

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs- [x] If necessary, also make a PR on the nf-core/demultiplex branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated. (@sam-white04 Will update as soon as PR is posted)
  • README.md is updated (including new tool citations and authors/contributors).

@github-actions
Copy link

github-actions bot commented Mar 8, 2023

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 9b611b2

+| ✅ 155 tests passed       |+
#| ❔   1 tests were ignored |#
!| ❗   4 tests had warnings |!

❗ Test warnings:

  • pipeline_todos - TODO string in README.md: Add full-sized test dataset and amend the paragraph below if applicable
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your prefered methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • schema_description - No description provided in schema for parameter: skip_tools

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 2.7.2
  • Run at 2023-03-15 15:25:00

@edmundmiller edmundmiller added this to the 1.2.0 milestone Mar 8, 2023
@edmundmiller edmundmiller linked an issue Mar 8, 2023 that may be closed by this pull request
@edmundmiller
Copy link
Collaborator

edmundmiller commented Mar 8, 2023

I don't mind them being squashed but that commit message 1. Has nothing to do with fqtk 2. Is from another PR and I think we from a rebase that didn't work out as planned.

@edmundmiller edmundmiller changed the title fqtk additional tool Add fqtk as a demultiplexer Mar 8, 2023
@sam-white04 sam-white04 requested a review from nh13 March 13, 2023 17:58
edmundmiller and others added 3 commits March 14, 2023 16:49
Co-authored-by: ewels <[email protected]>

 This is a combination of 3 commits.

Fqtk off of dev

Remove todo's from fqtk_demultiplex.nf

Update demultiplex.nf

Update README.md

Update input file paths
Update CHANGELOG.md
Copy link
Collaborator

@edmundmiller edmundmiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few small comments and style changes, but looks pretty good!

README.md Outdated Show resolved Hide resolved
subworkflows/local/fqtk_demultiplex/main.nf Outdated Show resolved Hide resolved

rg.ID = [fcid,lane].join(".")
rg.PU = [fcid, lane, index].findAll().join(".")
rg.PL = "SINGULAR"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be hard-coded?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emiller88 Good point, it probably should not be hardcoded. How else could it be filled in?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be a way to read in the read group. We can just throw a TODO on it though for now.

workflows/demultiplex.nf Show resolved Hide resolved
Comment on lines +390 to +423
def extract_csv_fqtk(input_csv) {

// Flowcell Sheet schema
// Possible values for the "content" column: [meta, path, number, string, bool]
def input_schema = [
'columns': [
'id': [
'content': 'meta',
'meta_name': 'id',
'pattern': '',
],
'samplesheet': [
'content': 'path',
'pattern': '^.*.csv$',
],
'lane': [
'content': 'meta',
'meta_name': 'lane',
'pattern': '',
],
'flowcell': [
'content': 'path',
'pattern': '',
],
'per_flowcell_manifest': [
'content': 'path',
'pattern': '',
]
],
required: ['id','flowcell', 'samplesheet', 'per_flowcell_manifest'],
]

return extract_csv(input_csv, input_schema)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are starting to get excessive. @matthdsm what happens if we put them in lib/? Would they still work the same?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emiller88 Moving the prep for ch_flowcells into subworkflows would remove the need for these functions to be in demultiplex.nf, right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, that's a great idea!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've no qualms putting all the functions in /lib. They're only in the main file for convenience!

Copy link
Collaborator Author

@sam-white04 sam-white04 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emiller88 Thank you for your comments. If everyone is on the same page, Im happy to pull the generation of ch_flowcells and ch_flowcells_tar into subworkflows for fqtk vs the other demultiplexers.


rg.ID = [fcid,lane].join(".")
rg.PU = [fcid, lane, index].findAll().join(".")
rg.PL = "SINGULAR"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emiller88 Good point, it probably should not be hardcoded. How else could it be filled in?

workflows/demultiplex.nf Show resolved Hide resolved
Comment on lines +390 to +423
def extract_csv_fqtk(input_csv) {

// Flowcell Sheet schema
// Possible values for the "content" column: [meta, path, number, string, bool]
def input_schema = [
'columns': [
'id': [
'content': 'meta',
'meta_name': 'id',
'pattern': '',
],
'samplesheet': [
'content': 'path',
'pattern': '^.*.csv$',
],
'lane': [
'content': 'meta',
'meta_name': 'lane',
'pattern': '',
],
'flowcell': [
'content': 'path',
'pattern': '',
],
'per_flowcell_manifest': [
'content': 'path',
'pattern': '',
]
],
required: ['id','flowcell', 'samplesheet', 'per_flowcell_manifest'],
]

return extract_csv(input_csv, input_schema)
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emiller88 Moving the prep for ch_flowcells into subworkflows would remove the need for these functions to be in demultiplex.nf, right?

Copy link
Collaborator

@edmundmiller edmundmiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good to me. #102 For follow up to clean up some things that needed to be introduced here.

@edmundmiller
Copy link
Collaborator

If you rerun the CI enough, it works 🙃

@edmundmiller edmundmiller merged commit f5ae6bf into nf-core:dev Mar 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

For sample demultiplexing from FASTQs, add fqtk.
3 participants