normalize-by-median.py the total of the reads indicated in the report does not match the input #1916

Open
mainavienne opened this issue Mar 8, 2022 · 0 comments


I ran normalize-by-median.py, and the total number of reads stated in its report does not match the actual number of reads in the input file.

My script:
interleave-reads.py --gzip -o sample1_paired.fastq.gz sample1_R1.fastq.gz sample1_R2.fastq.gz
normalize-by-median.py --gzip -M 700G -R sample1_norm.report -o sample1_paired_norm.fastq.gz -p sample1_paired.fastq.gz
split-paired-reads.py --gzip sample1_paired_norm.fastq.gz -1 sample1_norm_R1.fastq.gz -2 sample1_norm_R2.fastq.gz

The report file for the normalization says: DONE with sample1_paired.fastq.gz; kept 272104114 of 310860582 or 87.5%. That implies a total of 310,860,582 reads, but sample1_paired.fastq.gz actually contains 312,017,576 reads, which is 1,156,994 more than reported.
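For anyone who wants to reproduce the comparison, the input count can be checked independently of khmer. The following is only a minimal sketch: the helper names count_fastq_records and report_gap are hypothetical (not part of khmer), and it assumes the file is a gzipped FASTQ with exactly four lines per record (no wrapped sequence lines).

```python
import gzip


def count_fastq_records(path):
    """Count records in a gzipped FASTQ file.

    Assumption: every record occupies exactly 4 lines
    (header, sequence, '+', qualities).
    """
    n_lines = 0
    # Binary mode avoids decoding overhead; iteration yields one line at a time.
    with gzip.open(path, "rb") as handle:
        for _ in handle:
            n_lines += 1
    if n_lines % 4 != 0:
        raise ValueError(
            f"{path}: {n_lines} lines is not a multiple of 4; "
            "file may be truncated or not strict 4-line FASTQ"
        )
    return n_lines // 4


def report_gap(reported_total, counted_total):
    """Return (missing_reads, missing_percent_of_counted)."""
    missing = counted_total - reported_total
    return missing, 100.0 * missing / counted_total


if __name__ == "__main__":
    # Totals taken from the report and from the independent count above.
    missing, pct = report_gap(310860582, 312017576)
    print(f"{missing} reads unaccounted for ({pct:.2f}% of the input)")
```

Calling count_fastq_records("sample1_paired.fastq.gz") gives the number to compare against the total in sample1_norm.report.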

I have no idea where this difference comes from. Is this expected, and what could cause it?
