Concordance Analysis Bug - Large Number of Partitions? #316

jasongallant · 2024-09-10T17:41:37Z

Hi There,

I'm trying to follow along with this tutorial using my own data (http://iqtree.org/doc/recipes/concordance-vector).

I'm currently using the latest release (IQ-TREE multicore version 2.3.6 for Linux x86 64-bit built Aug 1 2024). I can successfully run this command:

iqtree2 -te astral_species_annotated.tree -p loci.best_model.nex --scfl 100 --prefix scfl -T 128

This example dataset contains 400 genes from a variety of bird species.

I'm trying to do something similar with about 25k genes. When I run this with the full dataset:

iqtree2 -te my_astral_species_annotated.tree -p my_loci.best_model.nex --scfl 100 --prefix scfl -T 128

I get this error:

Reading partition model file my_loci.best_model.nex ...
Reading "SETS" block...
terminate called after throwing an instance of 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >'
ERROR: STACK TRACE FOR DEBUGGING:
ERROR:
ERROR: *** IQ-TREE CRASHES WITH SIGNAL ABORTED
ERROR: *** For bug report please send to developers:
ERROR: *** Log file: loci.best_model.repaired.nex.log
ERROR: *** Alignment files (if possible)
Aborted

However, if I manually edit my_loci.best_model.nex to include only the first 10 genes, iqtree2 runs without issue. This makes me suspect that the crash is related to the large number of partitions, although the program crashes nearly instantly. I'm attempting this run on a machine with 128 processors and 2 TB of RAM.

Any suggestions on how to fix this or how to proceed? Many thanks in advance!

jasongallant · 2024-09-10T17:59:14Z

I wrote a little Python script that randomly subsets my_loci.best_model.nex. It looks like the limit before it crashes is somewhere between 200 and 400 sequences?
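
For anyone who wants to repeat this kind of test, here is a minimal sketch of a random subsetting script. It is not the script used above. It assumes the partition file has one `charset` statement per line and a `charpartition` block with one `MODEL: name` entry per line; the function name `subset_partitions` and the `seed` parameter are made up for illustration.

```python
import random
import re

# Matches "charset NAME = ..." and captures NAME.
CHARSET_RE = re.compile(r"^\s*charset\s+(\S+)\s*=", re.IGNORECASE)
# Assumed layout: one "MODEL: name" entry per line, ending in "," or ";".
ENTRY_RE = re.compile(r"^\s*(\S.*?):\s*(\S+?)\s*[,;]\s*$")


def subset_partitions(nexus_text, n_keep, seed=0):
    """Return a sets block restricted to n_keep randomly chosen charsets."""
    lines = nexus_text.splitlines()
    names = [m.group(1) for m in map(CHARSET_RE.match, lines) if m]
    keep = set(random.Random(seed).sample(names, n_keep))

    charsets, entries, header = [], [], None
    for line in lines:
        cs = CHARSET_RE.match(line)
        if cs:
            # Keep only the charset lines for the sampled partitions.
            if cs.group(1) in keep:
                charsets.append(line.rstrip())
        elif re.match(r"^\s*charpartition\b", line, re.IGNORECASE):
            header = line.rstrip()
        elif header is not None:
            # Inside the charpartition block: keep matching model entries.
            entry = ENTRY_RE.match(line)
            if entry and entry.group(2) in keep:
                entries.append(f"    {entry.group(1)}: {entry.group(2)}")

    out = ["#nexus", "begin sets;", *charsets]
    if header is not None:
        out.append(header)
        out.append(",\n".join(entries) + ";")
    out.append("end;")
    return "\n".join(out) + "\n"
```

Reading my_loci.best_model.nex into a string, calling `subset_partitions(text, 300)`, and writing the result to a new file gives a 300-partition input that can be passed to `-p`. Fixing the seed makes a crashing subset reproducible, which helps when sharing it.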

jasongallant · 2024-09-10T18:02:58Z

For what it's worth, this is the same type of analysis attempted in #155.

roblanf · 2024-09-16T03:54:17Z

@thomaskf and @bqminh any ideas here?

@jasongallant, one option you could try is to use --scf instead. I appreciate this is not the same, but it might get you some useful information and/or help us track down the bug

jasongallant · 2024-09-16T12:55:43Z

Hi @roblanf, thanks for the reply. I'm working with scf right now. I also noted another issue, #223, that affects tree calculations in scfl (originally noticed by @simone-says). It has made the going tough, but it looks like scf is the way forward until this gets ironed out. Let me know if I can provide more info on this end.

roblanf · 2024-09-17T00:47:04Z

Thanks for the cross-linking! As on the other thread, the most useful thing is a reproducible example if you have one, then as soon as one of us has time we can get straight to debugging.

thomaskf · 2024-09-24T01:59:05Z

Hi @jasongallant,
Thanks again for reporting the issue, and sorry for the delay. I have tested the program with a data set containing around 30K partitions, and it worked without any problems. Is it possible to share your data with us so we can investigate the issue further? If the dataset is too large, you may send a smaller subset where you encountered the error. Thank you very much!
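
One way to arrive at such a smaller subset is to bisect on the number of partitions. The sketch below is only an illustration: `smallest_failing_prefix` and the `fails` callback are hypothetical names, and `fails(k)` is assumed to write the first k partitions to a file, run the iqtree2 command from the report, and return True if it crashes.

```python
def smallest_failing_prefix(n_items, fails):
    """Return the smallest k in 1..n_items for which fails(k) is true.

    Returns None if even the full set does not fail.
    Assumes that if the first k partitions crash, the first k + 1 do too.
    """
    if n_items == 0 or not fails(n_items):
        return None
    lo, hi = 1, n_items  # invariant: fails(hi) is True
    while lo < hi:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid      # the crash already happens within the first mid
        else:
            lo = mid + 1  # need more than mid partitions to trigger it
    return lo
```

The prefix assumption holds whether the crash is caused by a limit on the partition count or by a single problematic partition, since any prefix that reaches the trigger point keeps containing it. With about 25k partitions this takes roughly 15 runs.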
