Add fqtk as a demultiplexer #99
I don't mind them being squashed, but that commit message (1) has nothing to do with fqtk and (2) is from another PR, I think from a rebase that didn't work out as planned.
Co-authored-by: ewels <[email protected]> This is a combination of 3 commits. Fqtk off of dev Remove todo's from fqtk_demultiplex.nf Update demultiplex.nf Update README.md Update input file paths
Update CHANGELOG.md
Few small comments and style changes, but looks pretty good!
rg.ID = [fcid,lane].join(".")
rg.PU = [fcid, lane, index].findAll().join(".")
rg.PL = "SINGULAR"
Should this be hard-coded?
@emiller88 Good point, it probably should not be hardcoded. How else could it be filled in?
There might be a way to read in the read group. We can just throw a TODO on it though for now.
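One way the hard-coded platform could be avoided, sketched under assumptions: the helper name `buildReadGroup` and its `platform` parameter are hypothetical and not part of this PR, and where the platform value would actually come from (a parameter, the samplesheet, or the run metadata) is left open.

```groovy
// Hypothetical helper: build the read-group fields with the platform passed in
// instead of hard-coded. The default keeps the value used in the diff.
def buildReadGroup(String fcid, String lane, String index, String platform = 'SINGULAR') {
    def rg = [:]
    rg.ID = [fcid, lane].join('.')
    // findAll() without a closure drops null or empty entries (Groovy truth),
    // so a missing index does not leave a trailing dot.
    rg.PU = [fcid, lane, index].findAll().join('.')
    rg.PL = platform
    return rg
}

println buildReadGroup('FC1', '1', 'ACGT')
println buildReadGroup('FC1', '1', null, 'ILLUMINA')
```

Keeping `SINGULAR` as the default would preserve the current behaviour while letting callers override it.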
def extract_csv_fqtk(input_csv) {

    // Flowcell Sheet schema
    // Possible values for the "content" column: [meta, path, number, string, bool]
    def input_schema = [
        'columns': [
            'id': [
                'content': 'meta',
                'meta_name': 'id',
                'pattern': '',
            ],
            'samplesheet': [
                'content': 'path',
                'pattern': '^.*.csv$',
            ],
            'lane': [
                'content': 'meta',
                'meta_name': 'lane',
                'pattern': '',
            ],
            'flowcell': [
                'content': 'path',
                'pattern': '',
            ],
            'per_flowcell_manifest': [
                'content': 'path',
                'pattern': '',
            ]
        ],
        required: ['id','flowcell', 'samplesheet', 'per_flowcell_manifest'],
    ]

    return extract_csv(input_csv, input_schema)
}
These are starting to get excessive. @matthdsm what happens if we put them in lib/? Would they still work the same?
@emiller88 Moving the prep for ch_flowcells into subworkflows would remove the need for these functions to be in demultiplex.nf, right?
Right, that's a great idea!
I've no qualms putting all the functions in /lib. They're only in the main file for convenience!
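As a rough sketch of what moving these helpers into lib/ could look like: Nextflow adds Groovy classes found in a pipeline's lib/ directory to the classpath, so the schema could live in a class with static methods. The class name `FlowcellSchemas` and the method name `fqtk` are hypothetical; the schema values are copied from the diff above.

```groovy
// Hypothetical lib/FlowcellSchemas.groovy; class and method names are
// illustrative assumptions, not taken from the PR.
class FlowcellSchemas {
    // Returns the fqtk flowcell-sheet schema from the diff, unchanged.
    static Map fqtk() {
        return [
            columns: [
                id                   : [content: 'meta', meta_name: 'id', pattern: ''],
                samplesheet          : [content: 'path', pattern: '^.*.csv$'],
                lane                 : [content: 'meta', meta_name: 'lane', pattern: ''],
                flowcell             : [content: 'path', pattern: ''],
                per_flowcell_manifest: [content: 'path', pattern: ''],
            ],
            required: ['id', 'flowcell', 'samplesheet', 'per_flowcell_manifest'],
        ]
    }
}

// In the workflow file the wrapper could then shrink to a single call, e.g.
// extract_csv(input_csv, FlowcellSchemas.fqtk())
def schema = FlowcellSchemas.fqtk()
println schema.columns.keySet()
```

Whether this is preferable to moving the channel preparation into subworkflows is a design choice for the maintainers; the sketch only shows that the schema itself does not need to sit in the main workflow file.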
Co-authored-by: Edmund Miller <[email protected]>
@emiller88 Thank you for your comments. If everyone is on the same page, I'm happy to pull the generation of ch_flowcells and ch_flowcells_tar into subworkflows for fqtk vs the other demultiplexers.
I think this looks good to me. See #102 for follow-up to clean up some things that needed to be introduced here.
If you rerun the CI enough, it works 🙃
Hello,
I have added fqtk as an optional demultiplexer. The tool fqtk requires an additional input to be provided as a path in the 5th column of --input samplesheet.csv.
Thank you,
Samantha White
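For illustration, a minimal sketch of how the five-column input described above might be checked against the required columns from the schema in this PR. The sample paths and the helper `missingColumns` are hypothetical placeholders; the pipeline's own `extract_csv` is not shown in this thread and may behave differently.

```groovy
// Hypothetical five-column samplesheet; every value is a placeholder.
def csvText = '''id,samplesheet,lane,flowcell,per_flowcell_manifest
run1,/path/to/samplesheet.csv,1,/path/to/flowcell,/path/to/per_flowcell_manifest
'''

// Column names listed as required by the fqtk schema in this PR.
def required = ['id', 'flowcell', 'samplesheet', 'per_flowcell_manifest']

// Hypothetical helper: return the required columns missing from the header row.
def missingColumns(String text, List<String> requiredCols) {
    def header = text.readLines()[0].split(',')*.trim()
    return requiredCols.findAll { !(it in header) }
}

def missing = missingColumns(csvText, required)
println(missing ? "Missing columns: ${missing}" : 'All required columns present')
```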
PR checklist
nf-core lint
nextflow run . -profile test,docker --outdir <OUTDIR>
docs/usage.md is updated.
docs/output.md is updated.
CHANGELOG.md is updated. (@sam-white04 Will update as soon as PR is posted)
README.md is updated (including new tool citations and authors/contributors).