Number of joined reads drastically decreased #19
Could be an issue with supporting different read lengths? I have not tested that extensively. Does your pre-merging script do any trimming, or does it just discard reads completely? Also, have you checked that the read identifiers are still matched up after your quality filtering? Is it possible, for example, that reads were not thrown out in pairs? Thanks,

On Aug 13, 2014, at 2:59 PM, brittbio [email protected] wrote:
> Hi John, Thanks for the fast reply! The reads are all 151bp long (sorry, there was an error in the note above when I said 149bp). The quality filtering completely discards reads; it does not trim them. So the reads are all still 151bp long. I don't believe the identifiers all match up: each forward and reverse read file is filtered independently, so there might be some reads removed in the forward that are still present in the reverse. Does SeqPrep look for the matching identifiers? That being said, over 70% of the reads are still there post-quality-filtering in both the forward and reverse files, so even if a few of the pairs are gone I would still expect more than 1,052 reads to merge, since pre-quality-filtering I had over 9 million reads merged. Thank you so much for your help and the script! :-) Cheers, Brittany
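To check whether the forward and reverse files are still paired record-for-record after filtering, a quick script along these lines could be used (a minimal sketch, not part of SeqPrep or QIIME; the file paths and function names are illustrative):

```python
import itertools

def read_id(header):
    """Identifier for a FASTQ header line: the first whitespace-delimited
    token without the leading '@', with any old-style /1 or /2 pair tag
    stripped so mates compare equal."""
    name = header.split()[0].lstrip("@")
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name

def fastq_ids(path):
    """Yield the identifier of every 4-line FASTQ record in the file."""
    with open(path) as handle:
        for header in itertools.islice(handle, 0, None, 4):
            yield read_id(header)

def first_mismatch(forward_path, reverse_path):
    """Index of the first record whose IDs disagree (or where one file
    runs out early), or None if the files are fully paired."""
    sentinel = object()
    pairs = itertools.zip_longest(fastq_ids(forward_path),
                                  fastq_ids(reverse_path),
                                  fillvalue=sentinel)
    for i, (fwd, rev) in enumerate(pairs):
        if fwd is sentinel or rev is sentinel or fwd != rev:
            return i
    return None
```

A non-None result means the two files have drifted out of sync at that record, which is exactly the failure mode of filtering each file independently.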
You definitely want to preserve matching. If SeqPrep doesn’t just error out when it finds mismatching reads, it may just attempt to keep matching them pairwise until it gets to the end of one of the files.

On Aug 13, 2014, at 4:40 PM, brittbio [email protected] wrote:
> I see, so SeqPrep uses the read location to merge them? Or does it just match reads systematically in the order they appear in the files? Thanks! Britt
Just systematically in the order they appear in the file.
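Given that order-based matching, one way around the problem is to re-pair the two filtered files before running SeqPrep: keep only reads present in both files and write them out in the same order. A minimal sketch (assuming standard 4-line FASTQ records; function names are illustrative, not a SeqPrep or QIIME API):

```python
def read_id(header):
    """First token of a FASTQ header, without '@' or an old-style /1 /2 tag."""
    name = header.split()[0].lstrip("@")
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name

def load_fastq(path):
    """Map read ID -> full 4-line record; dicts preserve insertion order
    (Python 3.7+), so the file order is retained."""
    records = {}
    with open(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break
            records[read_id(header)] = (header + handle.readline()
                                        + handle.readline() + handle.readline())
    return records

def repair(fwd_in, rev_in, fwd_out, rev_out):
    """Write only reads present in both inputs, in matching order.
    Returns the number of pairs kept."""
    fwd, rev = load_fastq(fwd_in), load_fastq(rev_in)
    shared = [rid for rid in fwd if rid in rev]  # forward-file order
    with open(fwd_out, "w") as fo, open(rev_out, "w") as ro:
        for rid in shared:
            fo.write(fwd[rid])
            ro.write(rev[rid])
    return len(shared)
```

Loading both files into memory is fine for a few million short reads; for very large runs, a streaming or disk-backed approach would be preferable. The sketch only illustrates the idea of restoring the pairwise ordering that SeqPrep relies on.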
Hello,
Thank you so much for writing this program! :) I was wondering if you could help me with a problem I have come across.
I am trying to merge MiSeq data: 149 bp forward and reverse reads (no adaptors). I am using SeqPrep via the QIIME command join_paired_ends.py (http://qiime.org/scripts/join_paired_ends.html). I have successfully merged millions of reads with your command, thank you! And the number of reads merged is similar to that of other programs (fastq-join). However, when I tried an approach in which I quality-filtered the reads first, very few reads were joined.
I filtered the reads with the FASTX-Toolkit (at least 75% of each read had to have a minimum quality score of 25). After filtering I still had over 13 million reads. Other merging programs (fastq-join) still merged a significant number of reads with these now quality-filtered FASTQ files; however, SeqPrep never merged more than 1,052 reads.
Do you have any idea why this may be? Please let me know if you need more information to address this question.
Thank you!
Brittany