-
Hi,
This leads to my question: are some of these substitution models mislabelled?
Regards,
-
I'm sure it's not mislabeling. There have been some discussions about that in the Google group. Q.bird and Q.mammal are actually quite similar (see the PCA in Figure 3 of the QMaker paper, https://academic.oup.com/sysbio/article/70/5/1046/6146362). I'd guess that for your dataset Q.bird happens to fit better than Q.mammal (perhaps only slightly), but that might not be the case for other datasets.
Suggestion: because you seem to have a lot of data, I'd recommend estimating a Q matrix from your own dataset and using it to infer a tree. This is what we suggested in QMaker.
-
PS: Buy me a coffee if you think it's helpful :-) https://buymeacoffee.com/bqminh
-
Hi Minh,
Ah, yes, the two models are placed quite close to one another in Fig. 3 of that paper. Even so, the difference between Q.bird and Q.mammal is huge (2973.918 BIC units), which is partly why I was perplexed.
Your suggestion is sensible. I ran the analysis last year and got a vastly improved estimate.
I also tried the NQ models for the same data, but IQ-TREE 2 aborted with an error. I uploaded the error to GitHub on 12 Oct 2023, but I don’t think anyone has found the bug yet.
All the best,
Lars
(In reply to Bui Quang Minh's message of Thursday, 2 May 2024 at 14:19, "Possible mislabelling of Q.bird substitution model", Discussion #186.)
-
I double-checked and @bqminh is right: there is no mislabelling. I'll post the code I used, because it can help with checking any model one might estimate.
To check, I randomly sampled 100 loci from the datasets we used to estimate the mammal and bird models and ran ModelFinder on those loci like this (the nexus file has the partitions and the sequence):
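The original command is not reproduced in this copy of the thread. As a rough sketch only, the sampling and the ModelFinder call could look something like the following. The locus names, the file name `bird_100_loci.nex`, and both helper functions are hypothetical; `-s`, `-p` and `-m MF` are standard IQ-TREE 2 options, but passing the same nexus file to both `-s` and `-p` is an assumption based on the remark that one file holds the partitions and the sequences.

```python
import random


def sample_loci(locus_names, n=100, seed=1):
    """Return n locus names drawn without replacement (seeded so the draw is repeatable)."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(locus_names), n))


def modelfinder_command(nexus_file):
    """Build an IQ-TREE 2 argument list that runs ModelFinder only, one model per partition."""
    return [
        "iqtree2",
        "-s", nexus_file,  # alignment (assumed to live in the same nexus file)
        "-p", nexus_file,  # partition definitions, one charset per locus
        "-m", "MF",        # model selection only, no tree search
    ]


# Hypothetical pool of 1000 locus names; replace with the real list.
all_loci = [f"locus_{i:04d}" for i in range(1, 1001)]
subset = sample_loci(all_loci)

# Hypothetical file name; the command is only printed here, not executed.
cmd = modelfinder_command("bird_100_loci.nex")
print(" ".join(cmd))
```

To actually launch the run, the list could be handed to `subprocess.run(cmd, check=True)` once the nexus file for the sampled loci exists.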
Then I counted up the models best fit to each of the 100 loci like this:
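The counting code is likewise missing from this copy. A minimal sketch of one way to do it, assuming the ModelFinder run wrote a `.best_scheme.nex`-style file in which a `charpartition` block lists one `model: partition` pair per locus. The function name and the three-locus example text are made up for illustration and are not results from the thread.

```python
import re
from collections import Counter


def count_best_models(best_scheme_text):
    """Count how often each base matrix (text before the first '+') was chosen."""
    # Capture everything between "charpartition <name> =" and the closing ";".
    match = re.search(r"charpartition\s+\S+\s*=\s*(.*?);", best_scheme_text, re.S | re.I)
    if match is None:
        return Counter()
    counts = Counter()
    for entry in match.group(1).split(","):
        model, sep, _partition = entry.strip().partition(":")
        if sep:  # skip anything that is not a "model: partition" pair
            counts[model.strip().split("+")[0]] += 1
    return counts


# Made-up example in the assumed format (not real output).
example = """#nexus
begin sets;
  charset locus_1 = 1-250;
  charset locus_2 = 251-600;
  charset locus_3 = 601-900;
  charpartition mymodels =
    Q.bird+G4: locus_1,
    Q.bird+I+G4: locus_2,
    Q.mammal+F+G4: locus_3;
end;
"""

counts = count_best_models(example)
for model, n in counts.most_common():
    print(model, n)
```

Stripping everything after the first `+` groups variants such as `Q.bird+G4` and `Q.bird+I+G4` under the same matrix, which is the comparison that matters when checking whether a matrix is mislabelled.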
The results. For the 100 bird loci: