Bug assertion in ts.allele_frequency_spectrum
#2923
Here's a MWE:
The issue here is that all the mutations to the same allele (here The mutation.parent is what's letting us properly count mutations that are nested in others. Even in cases without the bug assert, afs is reporting the wrong thing (heck, probably all the stats are).
Thanks Peter -- so, is the tskit_bug_assert the right exit point here, or should it raise a more specific error? If the former, I'll go ahead and close this.
Don't close it, I think? It would be nice to give a more informative error. The problem is definitely not with |
For context, I think we decided previously that "gee there aren't very many softwares producing tree sequences; they'll just all need to learn to use |
There's a general issue here that we've been too lax with checking mutation parents. We'll probably just have to figure out how to detect whether mutation parents are set properly at load time, and start raising errors if not. Otherwise these types of error will start creeping in more and more.
I'm pretty sure the only reasonable way to do that is to just run |
You're probably right. We could check for cheaper things, like: if we have two mutations on a branch, the parent must be set. But that's a fairly limited check. So, I guess the proposal would be to run |
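To make the "cheaper check" concrete, here is a minimal sketch in plain Python. It is not tskit code: the function name and the list-based columns (mut_site, mut_node, mut_parent) are hypothetical stand-ins for the mutation table columns, and NULL is assumed to be -1, matching tskit.NULL. It only flags a mutation that shares a site and node with an earlier mutation but has no parent set.

```python
NULL = -1  # assumed sentinel, same value as tskit.NULL


def same_branch_parent_violations(mut_site, mut_node, mut_parent):
    """Return IDs of mutations that follow another mutation at the same
    site on the same node (branch) but have no parent set.

    Hypothetical helper: the three arguments are parallel lists standing
    in for the site, node and parent columns of a mutation table.
    """
    seen = set()  # (site, node) pairs that already carry a mutation
    bad = []
    for j, key in enumerate(zip(mut_site, mut_node)):
        if key in seen and mut_parent[j] == NULL:
            bad.append(j)
        seen.add(key)
    return bad


# Three mutations at site 0: two on node 3, one on node 0.
print(same_branch_parent_violations([0, 0, 0], [3, 3, 0], [NULL, NULL, NULL]))
```

This also shows why the check is limited: if node 0 sits below node 3 in the tree, mutation 2 needs a parent too, but that cannot be detected without looking at the tree itself.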
with: afs_error.trees.gz
This example has one tree and one site. The segfault seems to be linked to the fact that there are multiple mutations at the site, but mutations.parent is tskit.NULL even when mutations are beneath another. That is, if I dump the tables and correct the mutations.parent column to reflect what mutation replaced what, then everything works.

This is output from SINGER, so I'll let them know that mutations.parent should be set during conversion to a tree sequence. But it'd be nice to have the above example run through, or at least throw an informative error (note that ts.diversity works, although it returns a negative number for the "sample_sets" case above).
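For readers unsure what "what mutation replaced what" means, here is a rough single-tree sketch in plain Python. It is an illustration only, not tskit's implementation (tskit has TableCollection.compute_mutation_parents for real tables). The function name, the node_parent list, and the toy tree are all made up for this example, and it assumes mutations at a site are listed with ancestral ones first.

```python
NULL = -1  # assumed sentinel, same value as tskit.NULL


def sketch_mutation_parents(node_parent, mut_site, mut_node):
    """Compute a mutation-parent column for a single tree.

    node_parent[u] is the parent node of u (NULL for the root).
    mut_site[j] and mut_node[j] give the site and node of mutation j.
    Assumes mutations at a site are ordered so that any mutation's
    ancestors come earlier in the list.
    """
    last_on_node = {}  # (site, node) -> most recent mutation seen there
    mut_parent = []
    for j, (site, node) in enumerate(zip(mut_site, mut_node)):
        parent = NULL
        u = node
        # Walk towards the root until a node carrying an earlier
        # mutation at this site is found.
        while u != NULL:
            if (site, u) in last_on_node:
                parent = last_on_node[(site, u)]
                break
            u = node_parent[u]
        mut_parent.append(parent)
        last_on_node[(site, node)] = j
    return mut_parent


# Toy tree: node 3 is the parent of samples 0 and 1; root 4 is the
# parent of node 3 and sample 2.
node_parent = [3, 3, 4, 4, NULL]
# One site with mutations on node 3, node 0 (below node 3) and node 2.
print(sketch_mutation_parents(node_parent, [0, 0, 0], [3, 0, 2]))
```

In this toy case only the mutation on node 0 gets a parent (the mutation on node 3 above it); the other two have nothing above them at the site, so they stay NULL.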