Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow historical samples in inside_outside method #214

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

hyanwong
Copy link
Member

Previously truncate_prior wasn't actually being called!

if np.any(tree_sequence.tables.nodes.time[tree_sequence.samples()] != 0):
        if False:  # This is never called
            priors = truncate_priors(...)

@hyanwong
Copy link
Member Author

hyanwong commented Aug 19, 2022

I have been adding extra functionality to this PR - I guess that's fine @awohns ?!

@hyanwong
Copy link
Member Author

hyanwong commented Sep 6, 2022

We also need to implement tests for build_prior with the parameters allow_historical_samples, truncate_priors, and node_var_override. Also a test for the internal _truncate_priors function. But once this is done, and if performance is not any worse than before, I wonder if this PR should be squashed and merged into main, and then we can figure out the issue in https://github.com/hyanwong/molecular-sample-dating/blob/main/notebooks/uniform_prior.ipynb

@hyanwong hyanwong changed the base branch from ancient_samples to main September 8, 2022 11:10
@hyanwong hyanwong changed the title Allow truncate_prior to actually function! Allow historical samples Sep 8, 2022
@awohns
Copy link
Member

awohns commented Sep 15, 2022

@hyanwong, one of the changes in the inside pass has dramatically slowed it down, I'm not sure which one

@hyanwong
Copy link
Member Author

@hyanwong, one of the changes in the inside pass has dramatically slowed it down, I'm not sure which one

OK, useful to know. I'll try to track it down.

@awohns
Copy link
Member

awohns commented Sep 15, 2022

Hmm, after a bit of testing on simulated tree sequences I'm not seeing a difference between the current version of tsdate and this PR.
I think the inferred tree sequence I'm working with might have some strange features that really slow the inside step down (never seen this before though) rather than any of the changes in this PR

@hyanwong
Copy link
Member Author

Yep, I had a look, and can't see anything that should have slowed it down.

It might be worth having a perf suite like msprime (asv I think it's called), but I suspect it might be a hassle to set up and run.

Rework build-prior and inside / outside logic to allow historical samples

And speed up time constraint algorithms while also allowing nodes to be out of time order
@hyanwong
Copy link
Member Author

hyanwong commented Jan 9, 2023

I've rebased this and run it on a reasonable-sized tree sequence and it doesn't seem to change the runtimes. I reckon we should think about merging it, and get a review from someone. The ability to include ancient samples is a big win, and we should get it in the next release.

@hyanwong
Copy link
Member Author

The API for build_grid is changed to include the following:

:param string prior_distribution: What distribution to use to approximate the
         conditional coalescent prior. Can be "lognorm" for the lognormal distribution
         (generally a better fit, but slightly slower to calculate) or "gamma" for the
         gamma distribution (slightly faster, but a poorer fit for recent nodes).
         Default: "lognorm"
     :param bool allow_historical_samples: should we allow historical samples (i.e. at
         times > 0). This invalidates the assumptions of the conditional coalescent, but
         may be acceptable if the historical samples are recent or if there are many
         contemporaneous samples. Default: ``False``
     :param bool truncate_priors: If there are historical samples, should we truncate the
         priors of all nodes which are their ancestors so that the probability of being
         younger than the oldest descendant sample is zero. As long as historical
         samples do not have ancestors that have been misassigned in the tree sequence
         topology, this should give better results. Default: ``True``
     :param dict nonfixed_sample_var: is a dict mapping sample node IDs to a variance
         value. Any nodes listed here will be treated as non-fixed nodes whose prior is
         not calculated from the conditional coalescent but instead are allocated a prior
         whose mean is the node time in the tree sequence and whose variance is the
         value in this dictionary. This allows sample nodes to be treated as nonfixed
         nodes, and therefore dated. If ``None`` (default) then all sample nodes are
         treated as occurring at a fixed time (as if this were an empty dict).

@hyanwong hyanwong changed the title Allow historical samples Allow historical samples in inside_outside method Jul 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants