Crash during alignment stage after 3.1 update #57

anearman · 2024-11-20T17:11:55Z

Hello! I was running 3.0 almost successfully yesterday but it crashed during gnomon training so I updated to 3.1. Today it appears to not make it through the alignment stages with the following error:

ERROR ~ Error executing process > 'egapx:rnaseq_short_plane:star:run_star (4)'

Caused by:
Process egapx:rnaseq_short_plane:star:run_star (4) terminated with an error exit status (3)

I'm attempting to annotate a small trypanosomatid genome (~34MB) with a proteome and ample RNAseq data. I was able to execute the example files with no problems for 3.0, though I haven't checked them for 3.1. Attached are the various log files.

Thank you!

issue.zip


victzh · 2024-11-20T21:15:04Z

I tried to replicate the issue and found that we don't have proteins for the taxonomy branch of your sequence. I see that you supplied the proteins yourself, but as far as I can tell that does not work as reliably as when the taxonomy branch is covered by us. If you can provide me with a link to download the proteins, I can try to replicate it again.

Thanks,

Victor.

anearman · 2024-11-20T21:26:29Z

Hi Victor,

Thanks for getting back! Attached is the proteome I used. Originally I had a collection of UniProt formatted proteins for all Trypanosomatids (~500MB) but that proved to be too much, so I restricted it to a functional annotation I previously performed for this species.

LpasUniProt.fasta.gz

Side note: these organisms don't have introns. Is there a way to account for this in the annotation process?

victzh · 2024-11-21T00:10:04Z

Thanks, I will try it again with your protein data. About introns: I don't know, but I will ask my colleagues. There should be a way, since we annotate many different kinds of organisms.

murphyte · 2024-11-21T15:54:53Z

Protists, including Trypanosomes, are currently out of scope for EGAPx, as stated on the home page. It's not just a matter of the protein sets -- we need to do additional development to adequately support protists and fungi. It's on our roadmap, but it'll likely be a while before we are ready to support Trypanosomes.

That doesn't explain the run_star failure. We do have logic to automate selection of max_intron size, and that logic is not set up for Trypanosomes, so it may be picking an unusual value that might cause failures.

victzh · 2024-11-21T16:02:36Z

I did a run with your proteins and NCBI's version of the sequence and SRA reads. It ran through STAR successfully, and even the memory requirements for it were not extreme. It failed later for me, suggesting that the sequence has too many similarities to prokaryotes, so it is probably contaminated.

But anyway, as already mentioned, we don't support this taxonomy branch yet, so even if it runs successfully after using another of our products, FCS (Foreign Contamination Screen), the results are not going to be valid.

anearman · 2024-11-21T16:04:31Z

Yeah, I did see the lack of support for Trypanosomes but wanted to see if I could sneak it through anyway. Perhaps this is why I was seeing a slightly different failure in v3.0 for gnomon training. I have a full annotation for this species already, but I'm having a hard time getting it to table2asn standards.

Mostly this was an easy (hopefully) test run before trying to push through several much larger genomes with our consortium project. We'll likely have to run most of those on the HPC, but it would be nice to be able to do some of the smaller ones locally.

I can rerun the example files to see if the run_star failure persists for v3.1 and report back if you think that will be helpful.

anearman · 2024-11-21T16:05:13Z

> I did have a run with your proteins and NCBI's version of sequence and SRA reads. It ran through STAR successfully, and even memory requirements for it were not extreme. It failed later for me suggesting that the sequence have too many similarities to proks, so it is probably contaminated.
>
> But anyways as already mentioned, we don't support this taxonomy branch yet, so even if it runs successfully after using our another product, FCS (Foreign Contamination Screening) the results are not going to be valid.

Thanks, Victor! Any thoughts as to why I'm having the run_star failure with 3.1 and did not have it with 3.0?

victzh · 2024-11-21T18:30:39Z

It may be an accidental fault in STAR. An error of this kind can happen if STAR failed and samtools can't read a full data chunk. On the other hand, the task should be retried, and if it is just a fluke it should complete. I don't see these retries in your run.trace.txt file. Can you send me the config file you have for Singularity, please? And what are the parameters of the machine you're running it on: CPUs, RAM?
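For anyone who wants to check a trace file for retries themselves, here is a minimal Python sketch. It assumes the trace is tab-separated with a header row containing `name` and `status` columns, and that each attempt of a task appears as its own row; the helper names and the sample rows (including the hashes) are made up for illustration and are not taken from the attached logs.

```python
import csv
import io
from collections import defaultdict


def summarize_trace(trace_text):
    """Group trace rows by task name and collect the status of each row."""
    rows = csv.DictReader(io.StringIO(trace_text), delimiter="\t")
    statuses = defaultdict(list)
    for row in rows:
        statuses[row["name"]].append(row["status"])
    return dict(statuses)


def tasks_without_retry(summary):
    """Return task names whose only recorded row is a single FAILED attempt."""
    return sorted(name for name, seen in summary.items() if seen == ["FAILED"])


# Hypothetical sample data, not the real run.trace.txt from this issue.
sample = (
    "task_id\thash\tname\tstatus\texit\n"
    "1\tab/123456\tegapx:rnaseq_short_plane:star:run_star (3)\tCOMPLETED\t0\n"
    "2\tcd/789abc\tegapx:rnaseq_short_plane:star:run_star (4)\tFAILED\t3\n"
)

for name in tasks_without_retry(summarize_trace(sample)):
    print("failed with no retry recorded:", name)
```

To run it on a real file, read the file contents into a string and pass that to `summarize_trace` in place of `sample`.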

anearman · 2024-11-22T15:49:40Z

If I remember correctly, I tried to continue after the first time it failed, then it failed again, so I deleted everything in the working directory, deleted the project directory, and started everything fresh after a restart and general update check.

I set the docker config to 31 CPUs and 120 GB RAM and then set a 20GB swap. When running on 3.0, there seemed to be no problems until the end after completing ~480 tasks. The only thing that might be strange is that I have Nextflow installed as a mamba environment stacked on the python environment for egapx, but it seemed to work fine until 3.1 was installed. The only other thing I found was my samtools version was slightly out of date, so I just updated.

I didn't change anything in the Singularity config file so it just says:
singularity.enabled = true

I did edit the docker config file to:
docker.enabled = true
process {
    memory = 120.GB
    cpus = 31
}
