New bulk 5'-RACE supported protocol, non-overlapping reads rescue #343
This PR adds:

- A new `library_generation_method` to allow the analysis of 5' RACE libraries where R1 reads do not start directly with the UMI. It adds a new process, `PRESTO_MASKPRIMERS_ALIGN_TRIM`, that launches `MaskPrimers.py align` in trim mode before the `PRESTO_MASKPRIMERS_UMI` process.
- `--assemblepairs_join`, to allow non-overlapping reads to be rescued by running `assemblepairs join` on the reads that failed `assemblepairs align`. In our libraries, a large proportion of reads do not overlap but are still detected as productive sequences at the end of the pipeline.

This PR doesn't add it at the moment, but would it be possible to have an option for IgBlast's 19-column mode?
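For reviewers unfamiliar with the first change, here is a minimal Python sketch of what trimming the sequence upstream of the UMI amounts to. It is a toy illustration, not the pRESTO implementation: the function name `trim_upstream`, the anchor sequence `GGATCC`, and the mismatch and search-window thresholds are all hypothetical. It assumes, as I understand the pRESTO documentation, that trim mode removes the region preceding the matched sequence and leaves the match itself intact.

```python
from typing import Optional


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def trim_upstream(
    read: str,
    anchor: str,
    max_mismatches: int = 1,
    max_start: int = 50,
) -> Optional[str]:
    """Drop everything before the best anchor match; keep the anchor itself.

    Returns None when no window within the first `max_start` positions
    matches the anchor with at most `max_mismatches` mismatches.
    All parameters are illustrative assumptions, not pipeline settings.
    """
    best = None  # (mismatches, start position)
    last_start = min(max_start, len(read) - len(anchor))
    for start in range(last_start + 1):
        mm = hamming(read[start:start + len(anchor)], anchor)
        if mm <= max_mismatches and (best is None or mm < best[0]):
            best = (mm, start)
    return None if best is None else read[best[1]:]


# Hypothetical R1 read: 6 nt of variable upstream sequence, then the anchor.
r1 = "ACGTTT" + "GGATCC" + "AACCGGTTAACC" + "TGCATGCA"
print(trim_upstream(r1, "GGATCC"))
```

After this step every retained read starts at a known position, so a fixed-offset UMI extraction can follow. Reads with no acceptable match return `None` here, which corresponds to the reads a real run would route to a failed-reads output.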
enhancement #342
PR checklist
- Do I need to add tests, and if so, can you tell me how? So far I've been testing with real datasets from my research lab. Should I add a test dataset to the nf-core/airrflow branch of the nf-core/test-datasets repository? If so, I'll check with my team to see what I can provide.
- Make sure your code lints (`nf-core lint`). `LookupError: Failed to clone from the remote: https://github.com/nf-core/modules.git`
- Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- Check for unexpected warnings in debug mode (`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`).
- Usage Documentation in `docs/usage.md` is updated. I'm new to analyzing this type of data, so I'm not familiar with the various AIRR library generation methods. Can you help me name the newly supported protocol? So far I've called it `specific_5p_race_umi`, but I'm not sure that's the right name.
- Output Documentation in `docs/output.md` is updated. There are two new output folders:
  - `presto/trim_upstream_umi_linker`, which stores the R1 reads from which the sequence upstream of the UMI was trimmed.
  - `presto/08-assemble-pairs-join`, created if the new `--assemblepairs_join` option is enabled.

  Are these folders named well enough to go into `docs/output.md`?
- `CHANGELOG.md` is updated.
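For anyone weighing the `--assemblepairs_join` option and its output folder, the rescue logic can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the pRESTO algorithm: the overlap test here is exact suffix/prefix matching, whereas the real align step scores overlaps statistically, and the names `overlap_assemble`, `assemble_with_rescue`, `gap`, and `min_len` are hypothetical.

```python
from typing import Optional, Tuple

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse-complement a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap_assemble(head: str, tail: str, min_len: int = 8) -> Optional[str]:
    """Merge head and tail if a suffix of head exactly equals a prefix of tail.

    Tries the longest candidate overlap first; returns None when no overlap
    of at least `min_len` bases exists.
    """
    for n in range(min(len(head), len(tail)), min_len - 1, -1):
        if head[-n:] == tail[:n]:
            return head + tail[n:]
    return None


def assemble_with_rescue(
    r1: str, r2: str, gap: int = 10, min_len: int = 8
) -> Tuple[str, str]:
    """Try overlap assembly; on failure, join the mates end to end.

    Returns the assembled sequence and which route produced it
    ("align" or "join"). The join route pads with `gap` N characters.
    """
    tail = revcomp(r2)  # R2 is sequenced from the opposite strand
    merged = overlap_assemble(r1, tail, min_len)
    if merged is not None:
        return merged, "align"
    return r1 + "N" * gap + tail, "join"


# Overlapping pair: the last 8 bases of R1 match the start of revcomp(R2).
print(assemble_with_rescue("ACGTACGTAAGGCCTT", revcomp("AAGGCCTTGGGGCCCC")))
# Non-overlapping pair: falls through to the join route.
print(assemble_with_rescue("ACGTACGT", revcomp("TTTTGGGG"), gap=3))
```

The point of the second route is that pairs failing the overlap step are kept rather than discarded, which matters when many of them turn out to be productive sequences downstream.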