Excessive memory usage #7
Hi Jan,
That definitely shouldn't happen. Are you still using a reference genome with unusual assembly names to perform the alignment and, if so, did you manually build the GATC fragment file from that assembly?
Cheers,
I suspected that myself, so I trimmed down the reference.
Interesting. I assume you're aligning the paired-end reads yourself (since the pipeline doesn't (yet) do this). Are you also aligning the single-end reads similarly, or letting the pipeline do it?
The single-end reads are also aligned before running the pipeline, using Bowtie 2.
On 27 Jun 2017 6:06 pm, "Owen Marshall" wrote:
Interesting. I assume you're aligning the paired-end reads yourself (since
the pipeline doesn't (yet) do this). Are you also aligning the single-end
reads similarly, or letting the pipeline do it?
As it happens, I've just generated some PE reads, so I'll look into this over the next day or so and see if I can reproduce the error. Paired-end compatibility has always been experimental, since most people won't have the budget for PE sequencing (and, in my experience at least, there's very little practical advantage to using PE reads for most use cases).
Hi Owen,
I'm trying to run the pipeline on DamID-seq samples, but the runs cannot complete due to the server running out of memory.
Even restricting the data to a single chromosome (dm6 chr3R) still results in a memory blowout. Memory usage grows non-linearly, starting when the Dam-protein sample is processed.
See the attached plot to illustrate this point.
Note that the data are paired-end reads.
The pipeline invocation is as follows:
~/tools/damidseq_pipeline-1.4/damidseq_pipeline --bowtie2_genome_dir=~/references/drosophila/dm6_bowtie2_index --gatc_frag_file=~/references/drosophila/dm6_GATC.gff --dam=Dam1_chr3R.bam DamXX_chr3R.bam
Regards,
Jan
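For anyone trying to reproduce this, here is a minimal sketch of one way to put a number on the memory growth described above: wrap the invocation in a small Python helper and read the peak resident set size of the child processes afterwards. The helper name `peak_child_rss_kib` is hypothetical and not part of damidseq_pipeline, and the sketch assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
import resource
import subprocess
import sys


def peak_child_rss_kib(cmd):
    """Run cmd to completion and return (exit_code, peak_rss_kib).

    Assumption: Linux, where ru_maxrss is in kilobytes (macOS reports bytes).
    RUSAGE_CHILDREN covers all waited-for descendants of this process, and
    ru_maxrss is the largest single process among them, not their sum.
    """
    completed = subprocess.run(cmd, check=False)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return completed.returncode, usage.ru_maxrss


if __name__ == "__main__":
    # Pass the command to measure as arguments, e.g. the pipeline invocation.
    code, peak = peak_child_rss_kib(sys.argv[1:])
    print(f"exit code {code}, peak child RSS {peak / 1024:.1f} MiB")
```

This only gives a single peak figure rather than a time series like the attached plot, but it is enough to compare runs (for example, single-end versus paired-end input) without extra tooling.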