-
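There are two questions:
Can the order of the input sequences change the tree topology? YES, even with the same seed number. At the start, IQ-TREE uses a random stepwise addition technique to construct initial parsimony trees. Different sequence orders may produce different parsimony trees, and the subsequent tree search may then diverge to different trees.
Can a different number of threads result in different trees? YES, even with the same seed number. The explanation is a bit involved, but it revolves around an inequality between two expressions.
While they are mathematically equivalent, they are not always the same in a computer due to rounding errors (there are many articles about this on the web). As an example, let's say we have an alignment with 10 sites. The tree log-likelihood is the sum over the site log-likelihoods. If several threads each sum a subset of the sites and the partial sums are then combined, the additions happen in a different order than with a single thread, so the rounded total can differ slightly.
BONUS: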
I believe this is the explanation for a paper (https://doi.org/10.1038/s41467-020-20005-6) about irreproducibility in IQ-TREE and RAxML. However, you don't really have to worry about this, because it happens only for datasets with little phylogenetic information. For such datasets, there is a much bigger issue with the reliability of the trees (e.g. low support values), and users need to do multiple runs with different seed numbers, plus other analyses, to verify the results anyway. Hope that helps.
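The rounding effect described above can be illustrated with a small sketch. This is not IQ-TREE code: the three site log-likelihood values and the helper names `sequential_sum` and `two_worker_sum` are made up for illustration, and the sketch simply assumes that each worker sums its own block of sites before the partial sums are combined.

```python
def sequential_sum(values):
    """Add the values left to right, as a single worker would."""
    total = 0.0
    for v in values:
        total += v
    return total


def two_worker_sum(values, split):
    """Sum two blocks of sites separately, then combine the partial sums.

    `split` is the (hypothetical) index where the sites are divided
    between the two workers.
    """
    left = sequential_sum(values[:split])
    right = sequential_sum(values[split:])
    return left + right


# Hypothetical per-site log-likelihoods, chosen only to make the effect visible.
site_lnl = [-0.1, -0.2, -0.3]

one_worker = sequential_sum(site_lnl)
two_workers = two_worker_sum(site_lnl, split=1)

# Same numbers, same mathematical sum, different order of additions.
print(repr(one_worker), repr(two_workers), one_worker == two_workers)
```

The two totals differ only in the last digits, but a tree search that compares the log-likelihoods of nearly equally good trees can take a different branch because of such a difference.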
-
PS: At the beginning of the log file, IQ-TREE prints several lines like this:
The results are only reproducible if
-
Beautiful explanation! Thank you!
My case was precisely this. I was using alternative MSAs to estimate the confidence interval of a metric that depends on the tree topology. When I decided to double-check whether the variance was due only to the MSA or also to the phylogeny reconstruction, I found that the tree topology varies even with a constant seed. This isn't an issue for me, since I was precisely measuring variance, but I decided to open a bug anyway because (1) I feel it could be beneficial to users if this were mentioned somewhere in the documentation (as there's little explanation of the seed parameter); and (2) I wanted to have this discussed somewhere in case someone googles it in the future.
-
Since this isn't an issue, but expected behaviour (noting that it's not expected by everyone, particularly if you haven't spent a long time thinking about how computers really work), I suggest we move this to the discussion forum.
-
Sure! Thanks again for the answer.
-
Just wanted to say thanks @apcamargo and @bqminh for documenting this so thoroughly! I was hoping to get some exact reproducibility of trees for unit testing and continuous integration tests. The combination of 1) Identical Input Files, 2) random seed |