Specify particular causal sites #121

Closed
jeromekelleher opened this issue Nov 22, 2023 · 5 comments · Fixed by #124

Comments

@jeromekelleher
Member

I think we do need to provide some mechanism for specifying particular causal sites, or we lose a lot of flexibility and much of the richness that simulating based on ARGs provides. For example, we may want to simulate multiple causal sites that arose on a single ancestral haplotype, or restrict to mutations that occurred within a given population.

One approach might be:

import numpy as np

def sim_trait(ts, model, *, num_causal=None, causal_sites=None, alpha=None, random_seed=None):
    if num_causal is not None and causal_sites is not None:
        raise ValueError("Cannot specify both num_causal and causal_sites")
    # More input validation
    rng = np.random.default_rng(random_seed)
    if num_causal is not None:
        # Sample distinct site IDs uniformly at random, then sort them
        causal_sites = rng.choice(ts.num_sites, size=num_causal, replace=False)
        causal_sites.sort()
    # Run the simulation based on causal_sites

Here, causal_sites would need to be a sorted list of site IDs, which I guess is OK?
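To make the "sorted list of site IDs" requirement concrete, here is a minimal sketch of how user-supplied IDs could be checked and normalised. The helper name `normalise_causal_sites` and its exact checks are hypothetical, not part of the proposal above; it sorts the input rather than rejecting unsorted lists, which is only one possible design.

```python
import numpy as np


def normalise_causal_sites(causal_sites, num_sites):
    """Return causal_sites as a sorted integer array, or raise ValueError.

    Hypothetical helper: num_sites plays the role of ts.num_sites.
    """
    sites = np.asarray(causal_sites)
    if sites.ndim != 1 or sites.size == 0:
        raise ValueError("causal_sites must be a non-empty 1D sequence")
    if not np.issubdtype(sites.dtype, np.integer):
        raise ValueError("causal_sites must contain integer site IDs")
    if sites.min() < 0 or sites.max() >= num_sites:
        raise ValueError("causal_sites contains an ID outside [0, num_sites)")
    sorted_sites = np.sort(sites)
    # After sorting, any duplicates sit next to each other
    if np.any(sorted_sites[1:] == sorted_sites[:-1]):
        raise ValueError("causal_sites contains duplicate site IDs")
    return sorted_sites


sorted_ids = normalise_causal_sites([5, 2, 9], num_sites=10)
```

Sorting on the caller's behalf keeps the interface forgiving; requiring pre-sorted input and raising otherwise would be the stricter alternative.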

It would probably be good to cook up a few examples demonstrating this, so that we can prove to ourselves that it does provide the flexibility we want.

Earlier discussions: #53

@jeromekelleher
Member Author

Note that we could imagine doing something like this too for the population-specific case:

import numpy as np

def sim_trait(ts, model, *, num_causal=None, causal_sites=None, population=None, alpha=None, random_seed=None):
    if num_causal is not None and causal_sites is not None:
        raise ValueError("Cannot specify both num_causal and causal_sites")
    if causal_sites is not None and population is not None:
        raise ValueError("Cannot specify both population and causal_sites")
    # More input validation
    rng = np.random.default_rng(random_seed)
    if population is not None:
        # Something like this: sites with a mutation above a node in the population
        in_population = ts.nodes_population[ts.mutations_node] == population
        candidates = np.unique(ts.mutations_site[in_population])
        # Then choose num_causal from the matching sites
        # (TODO: and the correct causal state, too)
        causal_sites = rng.choice(candidates, size=num_causal, replace=False)
        causal_sites.sort()
    elif num_causal is not None:
        causal_sites = rng.choice(ts.num_sites, size=num_causal, replace=False)
        causal_sites.sort()
    assert causal_sites is not None
    # Run the simulation based on causal_sites
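The population filter can be tried without a tree sequence by working on plain arrays. The sketch below assumes arrays shaped like tskit's `ts.nodes_population`, `ts.mutations_node` and `ts.mutations_site`; the function name and the toy data are made up for illustration. It treats the population of the node below a mutation as the population in which the mutation arose, which is an approximation, and it does not address choosing the causal state.

```python
import numpy as np


def sites_arising_in_population(nodes_population, mutations_node, mutations_site, population):
    """Return sorted, unique IDs of sites with a mutation above a node in `population`."""
    # Population of the node directly below each mutation
    in_population = nodes_population[mutations_node] == population
    return np.unique(mutations_site[in_population])


# Toy data: 5 nodes in populations 0 and 1, and 6 mutations over 5 sites
nodes_population = np.array([0, 0, 1, 1, 1])
mutations_node = np.array([0, 2, 3, 1, 4, 2])
mutations_site = np.array([0, 1, 1, 2, 3, 4])

candidates = sites_arising_in_population(
    nodes_population, mutations_node, mutations_site, population=1
)

# Pick two causal sites from the candidates and keep them sorted
rng = np.random.default_rng(42)
causal_sites = np.sort(rng.choice(candidates, size=2, replace=False))
```

With real data the three arrays would come from the tree sequence, and `causal_sites` could then be handed to the simulation in place of a random draw over all sites.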

@GertjanBisschop
Member

I guess we need to define first what population-specific means. Here, population-specific means that only mutations that arose in a specific population contribute to the phenotype of interest. You could also say that population-specific should mean that the effect size is population-specific. So depending on the population you are in, a mutation will have a different effect on your phenotype.

@jeromekelleher
Member Author

Huh, yes. We can start with a documentation example showing how to do the "arose in a given population" interpretation, and think later about the other one. It's unclear to me what the model is then, though - isn't your interpretation that the environmental noise varies by population?

@daikitag
Collaborator

daikitag commented Dec 4, 2023

Made a new pull request in #124.

I'm also not entirely sure what population-specific means, so shall we just focus on specifying causal sites at first? I thought this point was very good after reading the draft of the paper.

@GertjanBisschop
Member

I think we agreed that, for the examples I will add to the documentation, we would limit ourselves to filtering variants that arose either within a specific population or during a specific time band, and then pass a subset of those IDs to sim_trait or sim_phenotype.
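As a rough illustration of the time-band case, the sketch below filters on arrays shaped like tskit's `ts.mutations_time` and `ts.mutations_site`. The function name, the half-open band convention and the toy values are assumptions made for this example. Mutations with unknown time are stored as NaN; every comparison with NaN is false, so they drop out of the filter.

```python
import numpy as np


def sites_arising_in_time_band(mutations_time, mutations_site, lower, upper):
    """Return sorted, unique IDs of sites with a mutation in [lower, upper)."""
    in_band = (mutations_time >= lower) & (mutations_time < upper)
    return np.unique(mutations_site[in_band])


# Toy data: one mutation per site, the fourth with unknown (NaN) time
mutations_time = np.array([5.0, 12.0, 30.0, np.nan, 18.0])
mutations_site = np.array([0, 1, 2, 3, 4])

band_sites = sites_arising_in_time_band(mutations_time, mutations_site, lower=10, upper=20)

# A subset of band_sites is what would be passed on as the causal site IDs
rng = np.random.default_rng(1)
subset = np.sort(rng.choice(band_sites, size=1, replace=False))
```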

@daikitag daikitag linked a pull request Jan 23, 2024 that will close this issue
@mergify mergify bot closed this as completed in #124 Jan 24, 2024