Excessive memory usage #7

Open
jibsch opened this issue Jun 23, 2017 · 5 comments

Comments

@jibsch

jibsch commented Jun 23, 2017

Hi Owen,

I'm trying to run the pipeline on DamID-seq samples, but the runs cannot complete due to the server running out of memory.

Even extracting a single chromosome (dm6 chr3R) from the data will result in a memory blowout. The memory usage progresses non-linearly starting from processing the Dam-protein sample.
See the attached plot to illustrate this point.
[image: memory usage plot]

Note that the data are paired-end reads.

The pipeline invocation is as follows:
~/tools/damidseq_pipeline-1.4/damidseq_pipeline --bowtie2_genome_dir=~/references/drosophila/dm6_bowtie2_index --gatc_frag_file=~/references/drosophila/dm6_GATC.gff --dam=Dam1_chr3R.bam DamXX_chr3R.bam

Regards,
Jan

@owenjm
Owner

owenjm commented Jun 26, 2017

Hi Jan,

That definitely shouldn't happen. Are you still using a reference genome with unusual assembly names to perform the alignment, and did you manually build the GATC fragment file from that assembly if so?

Cheers,
Owen

@jibsch
Author

jibsch commented Jun 26, 2017

I was suspecting that myself, so I trimmed down the reference.
The reference genome contains only the single chromosome.
I can run the same data in single-end mode and the issue does not arise, so it's very likely in the code relating to paired-end data.
Cheers,
Jan

@owenjm
Owner

owenjm commented Jun 27, 2017

Interesting. I assume you're aligning the paired-end reads yourself (since the pipeline doesn't (yet) do this). Are you also aligning the single-end reads similarly, or letting the pipeline do it?

@jibsch
Author

jibsch commented Jun 27, 2017 via email

@owenjm
Owner

owenjm commented Jun 27, 2017

As it happens I've just generated some PE reads, so I'll look into this over the next day or so and see if I can reproduce the error. Paired-end compatibility has always been experimental, since most people will not have the money for PE seq (there's very little advantage to using PE reads in practice, in my experience at least, for most use-cases).
