Specify particular causal sites #121

jeromekelleher · 2023-11-22T17:02:56Z

I think we do need to provide some mechanism for specifying particular causal sites, or we lose a lot of flexibility and much of the richness that simulating based on ARGs provides. For example, we may want to simulate multiple causal sites that arose on a single ancestral haplotype, or restrict to mutations that occurred within a given population.

One approach might be:

import numpy as np

def sim_trait(ts, model, *, num_causal=None, causal_sites=None, alpha=None, random_seed=None):
    if num_causal is not None and causal_sites is not None:
        raise ValueError("Cannot specify both num_causal and causal_sites")
    # More input validation
    if num_causal is not None:
        # Draw num_causal distinct site IDs uniformly at random
        rng = np.random.default_rng(random_seed)
        causal_sites = rng.choice(ts.num_sites, size=num_causal, replace=False)
        causal_sites.sort()
    # Run the simulation based on causal_sites

Here, causal_sites would need to be a sorted list of site IDs, which I guess is OK?
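
A minimal sketch of the input handling this implies, assuming we want to reject unsorted, duplicated, or out-of-range site IDs up front. The helper name resolve_causal_sites and its exact error messages are hypothetical and not part of any existing API; it takes num_sites directly so it can be tried without a tree sequence.

```python
import numpy as np


def resolve_causal_sites(num_sites, *, num_causal=None, causal_sites=None, random_seed=None):
    """Return a sorted array of causal site IDs (hypothetical helper)."""
    if num_causal is not None and causal_sites is not None:
        raise ValueError("Cannot specify both num_causal and causal_sites")
    if num_causal is None and causal_sites is None:
        raise ValueError("Must specify one of num_causal or causal_sites")

    if causal_sites is None:
        # Random case: draw distinct site IDs, then sort them.
        rng = np.random.default_rng(random_seed)
        chosen = rng.choice(num_sites, size=num_causal, replace=False)
        return np.sort(chosen)

    # User-specified case: check the IDs rather than silently fixing them.
    causal_sites = np.asarray(causal_sites, dtype=np.int64)
    if causal_sites.ndim != 1:
        raise ValueError("causal_sites must be a 1D list of site IDs")
    if np.any(causal_sites < 0) or np.any(causal_sites >= num_sites):
        raise ValueError("causal_sites contains IDs outside [0, num_sites)")
    if np.any(np.diff(causal_sites) <= 0):
        raise ValueError("causal_sites must be sorted and free of duplicates")
    return causal_sites
```

With a tree sequence in hand, this would be called as resolve_causal_sites(ts.num_sites, causal_sites=[...]). Whether to raise on unsorted input or sort it silently is a design choice; raising keeps the caller's ordering assumptions explicit.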

It would probably be good to cook up a few examples demonstrating this, so that we can prove to ourselves that it does provide the flexibility we want.

Earlier discussions: #53

jeromekelleher · 2023-11-23T09:26:49Z

Note that we could imagine doing something like this too for the population-specific case:

import numpy as np

def sim_trait(ts, model, *, num_causal=None, causal_sites=None, population=None, alpha=None, random_seed=None):
    if num_causal is not None and causal_sites is not None:
        raise ValueError("Cannot specify both num_causal and causal_sites")
    if causal_sites is not None and population is not None:
        raise ValueError("Cannot specify both population and causal_sites")
    # More input validation
    rng = np.random.default_rng(random_seed)
    if population is not None:
        # something like ts.nodes_population[ts.mutations_node] == population
        # then choose num_causal from the matching sites (and the correct causal state, too)
        raise NotImplementedError("population filtering not written yet")
    else:
        if num_causal is not None:
            causal_sites = rng.choice(ts.num_sites, size=num_causal, replace=False)
            causal_sites.sort()
    assert causal_sites is not None
    # Run the simulation based on causal_sites
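
A rough sketch of how the population branch could pick its candidate sites, following the ts.nodes_population[ts.mutations_node] == population idea in the comment. The function names are hypothetical. The arrays are passed in explicitly so the logic can be tried on toy data; with tskit they would be ts.nodes_population, ts.mutations_node and ts.mutations_site. This assumes the population of the node below a mutation is a good enough proxy for where the mutation arose, and it does not deal with choosing the causal state.

```python
import numpy as np


def sites_arising_in_population(nodes_population, mutations_node, mutations_site, population):
    """Sorted IDs of sites with at least one mutation above a node in `population`."""
    # Population of the node directly below each mutation.
    mutation_population = nodes_population[mutations_node]
    in_population = mutation_population == population
    # np.unique also sorts, so the result is a sorted list of site IDs.
    return np.unique(mutations_site[in_population])


def choose_causal_sites(candidate_sites, num_causal, random_seed=None):
    """Draw num_causal distinct sites from the candidates, returned sorted."""
    rng = np.random.default_rng(random_seed)
    chosen = rng.choice(candidate_sites, size=num_causal, replace=False)
    return np.sort(chosen)


# Toy data: 5 nodes in populations 0/1, and 5 mutations spread over 4 sites.
nodes_population = np.array([0, 0, 1, 1, 0])
mutations_node = np.array([0, 2, 3, 4, 2])
mutations_site = np.array([0, 1, 1, 2, 3])

candidates = sites_arising_in_population(
    nodes_population, mutations_node, mutations_site, population=1
)
causal_sites = choose_causal_sites(candidates, num_causal=1, random_seed=42)
```

Keeping the filtering outside sim_trait like this means the function only ever needs the causal_sites argument, and the population argument becomes optional sugar.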

GertjanBisschop · 2023-11-23T12:01:43Z

I guess we need to define first what population-specific means. Here, population-specific means that only mutations that arose in a specific population contribute to the phenotype of interest. You could also say that population-specific should mean that the effect size is population-specific. So depending on the population you are in, a mutation will have a different effect on your phenotype.

jeromekelleher · 2023-11-23T13:39:37Z

Huh, yes. We can start with a documentation example showing how to do the "arose in a given population" interpretation, and think later about the other one. It's unclear to me what the model is then, though - isn't your interpretation that the environmental noise varies by population?

daikitag · 2023-12-04T13:44:53Z

Made a new pull request in #124.

I'm also not entirely sure what population-specific means, so shall we just focus on specifying causal sites at first? I thought this point was very good after reading the draft of the paper.

GertjanBisschop · 2023-12-04T13:55:12Z

I think we agreed that, for the examples I will add to the documentation, we would limit ourselves to filtering variants that arose either within a specific population or within a specific time band, and then pass a subset of those IDs to sim_trait or sim_phenotype.
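
For the time-band case, the filtering could look something like the sketch below. The helper name is hypothetical and the arrays stand in for ts.mutations_time and ts.mutations_site. It assumes mutation times are known; unknown times (NaN in tskit) are simply excluded.

```python
import numpy as np


def sites_arising_in_time_band(mutations_time, mutations_site, min_time, max_time):
    """Sorted IDs of sites with a mutation whose time lies in [min_time, max_time)."""
    known = ~np.isnan(mutations_time)
    # Comparisons against NaN are False, so unknown times never match.
    in_band = known & (mutations_time >= min_time) & (mutations_time < max_time)
    return np.unique(mutations_site[in_band])


# Toy data: one mutation has an unknown time, and site 3 has two mutations.
mutations_time = np.array([5.0, 50.0, np.nan, 120.0, 80.0])
mutations_site = np.array([0, 1, 2, 3, 3])

candidates = sites_arising_in_time_band(mutations_time, mutations_site, 40, 100)

# Take a random subset of the candidates, sorted as causal_sites is expected to be.
rng = np.random.default_rng(1)
causal_sites = np.sort(rng.choice(candidates, size=1, replace=False))

# The subset would then be passed on, assuming the signature proposed in this thread:
# sim_trait(ts, model, causal_sites=causal_sites, ...)
```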

daikitag linked a pull request Jan 23, 2024 that will close this issue

CODE: causal_sites #124

Merged

mergify bot closed this as completed in #124 Jan 24, 2024
