Documentation for `explainable` for model with continuous variables #565

PoorvaGarg · 2024-08-23T18:36:57Z

This pull request adds a tutorial for the `explainable` module, applied to a model with continuous variables and a dynamical system.

SamWitty · 2024-11-04T22:08:27Z

Thanks for taking this on @rfl-urbaniak and @PoorvaGarg! It's great progress towards a very interesting combination of previously disconnected ideas. I have a fair number of comments because the notebook is long and detailed, but overall I really like where this is going. I've made a small number of editorial changes to the notebook itself, which can be found in #573. The more substantive requests for changes are better left to discussion and then collaborative revision. With that, here are my comments:

Main Feedback:

The notebook appears to be missing a clear and obvious explanatory question to motivate the analysis, written in plain English at first and then translated into specific operations later when mathematical terms are introduced. My suggestion would be to expand the text after “only one of them is responsible, and we are interested in being able to identify which one”, and move this earlier and more prominently in the tutorial.
The notebook is missing a clear and obvious conclusion we should draw from this analysis. This conclusion should be emphasized strongly visually (e.g. text in bold or a small subsection), and point to specific visual evidence and concrete numeric quantities. Once the conclusion is made clear, it would be nice to talk a bit about the policy or scientific implications. In the conclusion the notebook states the following: “It is evident from the plot above that the counterfactual for lockdown has more probability mass in the top right quadrant (low overshoot in the necessity world and high overshoot in the sufficient world). This gives us a clearer picture into why lockdown has higher causal role in the overshoot being too high as compared to masking.” It does not give me a clear picture of why lockdown has a higher causal role.
The notebook doesn’t discuss assumptions, possible violations of those assumptions, and their implications. This is particularly important because the model implicitly assumes that the structural functions (in this case, the SIR model) are known except for the specific parameters. As explanations in this notebook are derived from counterfactuals requiring strong assumptions like those found in this tutorial, there are many models where this kind of analysis would not be appropriate, or would otherwise need to be modified to account for more uncertainty in the structural functions.

Detailed Feedback:

“Now we incorporate the Bayesian SIR model into a larger model that includes the effect of two different policies, lockdown and masking, where each can be implemented with $50%$ probability (these probabilities won't really matter, as we will be intervening on these, the sampling is mainly used to register the parameters with Pyro).” Why do we need pyro.sample sites rather than pyro.deterministic sites here?

Why do we need the MaskedStaticIntervention?

I separated the policy_model into policy_model and overshoot_query, which highlights that there is some repeated code that could be refactored further. Specifically, the get_overshoot function in the introduction could be vectorized when it is introduced, and then reused later in the actual definition of the model.
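
To illustrate the vectorization suggestion, here is a minimal sketch in NumPy. It assumes that overshoot is the susceptible fraction at peak infection minus the susceptible fraction at the final time, and that trajectories carry time on the last axis. The name `overshoot_batched` and the array layout are hypothetical and are not the notebook's actual get_overshoot.

```python
import numpy as np

def overshoot_batched(S, I, R):
    """Overshoot for any batch of SIR trajectories (time on the last axis).

    Assumed definition: susceptible fraction at the infection peak minus the
    susceptible fraction at the final time point.
    """
    S, I, R = (np.asarray(x, dtype=float) for x in (S, I, R))
    total = S + I + R                               # population at each time step
    frac_S = S / total
    t_peak = np.asarray(np.argmax(I, axis=-1))      # peak-infection index per trajectory
    S_peak = np.take_along_axis(frac_S, t_peak[..., None], axis=-1)[..., 0]
    return S_peak - frac_S[..., -1]

# Two hypothetical trajectories, each with four time points.
S = [[90, 60, 30, 20], [90, 80, 75, 74]]
I = [[10, 30, 40, 10], [10, 15, 12, 5]]
R = [[0, 10, 30, 70], [0, 5, 13, 21]]
out = overshoot_batched(S, I, R)
```

Because the function only indexes along the last axis, the same code would apply unchanged to a single trajectory or to trajectories with extra leading batch dimensions.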

“Trajectories and overshoot distribution in the but-for analysis” requests:

Add axis labels to all plots
Make it clear that the rows correspond to the same intervention configuration. Currently it’s implied by them being next to each other, but the intervention configuration label is only on the leftmost plot.
Fix the range of the histograms to be shared across all. Currently, it’s hard to visually compare.
Show the mean as a vline on each histogram, and show the overshoot threshold as a vline on each histogram. Once this is done, you’ll need a single histogram legend.

importance_infer: This is important enough that I think it justifies a little more explanation. As is, an unfamiliar audience may be left wondering a few things:

Is importance sampling essential to the solution of this notebook, or is it just one of many plausible inference strategies from Pyro one could use?
What do the return values represent?
How do I get away with sampling importance weights a single time and then using them to answer many different questions downstream? (I believe the answer here is because we’ve broadcast the entire importance sampling procedure over all counterfactual worlds.)
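
The weight-reuse question can be illustrated without Pyro. The sketch below is plain self-normalized importance sampling on a toy target, not the notebook's importance_infer; the target, the proposal, and all function names are assumptions chosen for illustration. It draws samples and log weights once and then reuses them to answer two different queries.

```python
import math
import random

def log_normal_pdf(x, mu, sigma):
    """Log density of a normal distribution with mean mu and std sigma."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def draw_weighted_samples(num_samples, seed=0):
    """Sample once from the proposal N(0, 2) and weight toward the target N(1, 1)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 2.0) for _ in range(num_samples)]
    log_ws = [log_normal_pdf(x, 1.0, 1.0) - log_normal_pdf(x, 0.0, 2.0) for x in xs]
    return xs, log_ws

def weighted_expectation(f, xs, log_ws):
    """Self-normalized estimate of E_target[f(x)] from stored samples and log weights."""
    m = max(log_ws)                          # subtract the max for numerical stability
    ws = [math.exp(lw - m) for lw in log_ws]
    return sum(w * f(x) for w, x in zip(ws, xs)) / sum(ws)

xs, log_ws = draw_weighted_samples(20000)

# Two different questions answered from the same samples and weights.
mean_estimate = weighted_expectation(lambda x: x, xs, log_ws)
tail_estimate = weighted_expectation(lambda x: float(x > 2.0), xs, log_ws)
```

The same `(xs, log_ws)` pair serves both queries because the weights depend only on the sampled values, not on the function being averaged.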

Why do we need alternatives? Shouldn’t this always be equal to $\text{supports} \setminus \text{antecedents}$?
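
For finite supports, the set-difference reading of this question can be written down directly. This is a sketch under that assumption only: `default_alternatives` and the dictionary layout are hypothetical and are not part of the `explainable` API, and a continuous support would need a sampling rule rather than an enumeration.

```python
def default_alternatives(supports, antecedents):
    """For each antecedent, return the support values other than the antecedent value."""
    return {
        name: sorted(set(supports[name]) - {value})
        for name, value in antecedents.items()
    }

# Hypothetical binary policies: 1 means the policy is imposed, 0 means it is not.
supports = {"lockdown": [0, 1], "mask": [0, 1]}
antecedents = {"lockdown": 1, "mask": 1}
alternatives = default_alternatives(supports, antecedents)
```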

Can we add some citations justifying “degree of responsibility”?

“The reader might have the impression that the numbers are relatively low: …” Could we suggest a more intuitive description of what these probabilities are? If they shouldn’t be interpreted as any conceptually meaningful probabilities, then we should emphasize that explicitly or consider removing.

“Counterfactual - necessity world” figure (and all following similar plots) requests:

The overlapping histogram plots are a bit difficult to interpret. Something about the way the colors blend together is hard to read. Could you change this to multiple subplots with shared axes or change the color overlap somehow?

“Filter for the relevant context” -> Something about conditioning on the context nodes. I don’t like the word “filter” here.

“Comparing how necessity interventions for the two antecedents affect the overshoot” I don’t know what this means. I think the terminology of “necessity world” and “sufficiency world” is a little confusing. Is there a way we could reuse the mathematical notation introduced earlier to denote which counterfactual world we’re considering at any given point in the narrative?

Heatmap suggestions/questions:

We need a colorbar showing how the colors correspond to probability density.
How should I interpret the vertical and horizontal lines? I think I get that “mean overshoot” is the point at the intersection, but I don’t know how to interpret “overshoot too high”. Is this saying that any configuration above and to the right of that intersection is too high?
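
One way to make the heatmap reading concrete is to report the probability mass in the quadrant of interest as a number. The sketch below assumes paired overshoot samples from the two counterfactual worlds and a single threshold. The sample values, the threshold, and `quadrant_mass` are made up for illustration and do not come from the notebook.

```python
def quadrant_mass(necessity, sufficiency, threshold, weights=None):
    """Weighted fraction of paired samples with low overshoot in the necessity
    world and high overshoot in the sufficiency world."""
    if weights is None:
        weights = [1.0] * len(necessity)
    hit = sum(
        w
        for n, s, w in zip(necessity, sufficiency, weights)
        if n < threshold and s > threshold
    )
    return hit / sum(weights)

# Hypothetical paired overshoot samples, one pair per draw.
necessity = [0.10, 0.15, 0.25, 0.12]
sufficiency = [0.30, 0.18, 0.28, 0.22]
mass = quadrant_mass(necessity, sufficiency, threshold=0.20)
```

Reporting this quantity for each antecedent would give the comparison between lockdown and masking a concrete numeric basis alongside the plot.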

SamWitty

See above comment for requested changes.

PoorvaGarg and others added 30 commits August 1, 2024 17:02

tests for indepepdent and correct

080e39d

added print

7dd59ad

extra case

03516b7

debugged reverse

a7b3a8a

debug consequen_eq_neq

6ee8651

fixed test_consequent_eq_neq

513ec6e

fixed the test with dimensions

ce96b9f

consequent_eq_neq

1f8e72a

three variable model

bc5ffb6

testing three dependent

5aabfc6

debugging

776208f

minimal example for three independent variables

c0a22c0

more three variable models

1911de2

diverge

84381e2

debugged

bbc121c

notebook tested three variable models

498f070

three variable test cases aded

1e65c7d

clean up

4058660

test for factual log probs

34d0faf

more clean up

9326a52

fixed a lint error

45e75d6

lint clean

075c33a

reverted metadata

dde4d36

ground truth for conditioning on deterministic node

75e9f05

responsibility debug

7e2501a

documentation commit

2a38798

responsibility example

9254529

documentation completed

61aa26b

small typos

58842f6

small changes

9ee3068

rfl-urbaniak and others added 14 commits August 27, 2024 16:53

small edits

2f0d0b7

fixed heatmap prob

dabb4c2

math formulae and other tweaks

8b6a391

tweaks

2209626

revised the sir notebook

b796608

fix num samples

d7429a4

small fixes

b0d8e1d

toc change

ff0739e

formulae clarified with small changes

ca71be5

tweaks

ca1bc4c

tweaks

43a2e54

despine call

e68d103

grammar and small fixes

0bec84e

tweaks

7900524

PoorvaGarg marked this pull request as ready for review August 30, 2024 18:47

PoorvaGarg marked this pull request as draft August 30, 2024 18:47

PoorvaGarg added 3 commits August 30, 2024 14:53

Merge branch 'master' into explainable-continuous-doc

99adcc1

clean up

d6a63d7

html corrected

199dab4

PoorvaGarg marked this pull request as ready for review August 30, 2024 19:17

PoorvaGarg added status:awaiting review Awaiting response from reviewer and removed status:WIP Work-in-progress not yet ready for review labels Aug 30, 2024

PoorvaGarg requested review from SamWitty and eb8680 August 30, 2024 19:18

SamWitty removed the request for review from eb8680 November 4, 2024 22:08

SamWitty requested changes Nov 4, 2024

SamWitty added status:awaiting response Awaiting response from creator and removed status:awaiting review Awaiting response from reviewer labels Nov 4, 2024

Requested changes to explainable_sir.ipynb (#573)

e08ea78

* edit intro * progress * remove plate and more edits
