Merge pull request #299 from r-causal/po-assumptions-tt-sm

Overhaul Ch3 and Ch4
r-causal · Dec 23, 2024 · d6d4169
2 parents 39042f0 + 0533e6d

Showing 24 changed files with 1,888 additions and 1,126 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -28,6 +28,7 @@ Imports:
     here,
     janitor,
     kableExtra,
+    katex,
     labelled,
     lmw,
     lubridate,

diff --git a/_quarto.yml b/_quarto.yml
@@ -33,10 +33,10 @@ book:
       chapters:
       - chapters/01-casual-to-causal.qmd
       - chapters/02-whole-game.qmd
-      - chapters/03-counterfactuals.qmd
-      - chapters/04-target-trials-std-methods.qmd
-      - chapters/05-dags.qmd
-      - chapters/06-not-just-a-stats-problem.qmd
+      - chapters/03-po-counterfactuals.qmd
+      - chapters/04-dags.qmd
+      - chapters/05-not-just-a-stats-problem.qmd
+      - chapters/06-stats-models-ci.qmd
 
     - part: The Design Phase
       chapters:

diff --git a/chapters/01-casual-to-causal.qmd b/chapters/01-casual-to-causal.qmd
@@ -123,13 +123,13 @@ In this (simplified) recreation of their plot from July 2020, you can see the st
 ft_excess_deaths <- read_csv(here::here("data/ft_excess_deaths.csv")) |>
   mutate(year = factor(year))
 
-excess_deaths_prior_wk <- ft_excess_deaths %>%
+excess_deaths_prior_wk <- ft_excess_deaths |>
   filter(year != 2020)
 
-excess_deaths2020_wk <- ft_excess_deaths %>%
+excess_deaths2020_wk <- ft_excess_deaths |>
   filter(year == 2020)
 
-excess_deaths_wk <- ft_excess_deaths %>%
+excess_deaths_wk <- ft_excess_deaths |>
   filter(year == 2020)
 
 ggplot(
@@ -212,7 +212,7 @@ It helps us understand the population we're working with, the distribution of th
 It also helps us be sure that the data structure we're using matches the question we're trying to answer, as we'll see in [Chapter -@sec-data-causal].
 You should always do descriptive analyses of your data when conducting causal research.
 
-Finally, as we'll see in [Chapter -@sec-trials-std], there are certain circumstances where we can make causal inferences with basic statistics.
+Finally, as we'll see in [Chapter -@sec-strat-outcome], there are certain circumstances where we can make causal inferences with basic statistics.
 Be cautious about the distinction between the causal question and the descriptive component here, too: just because we're using the same calculation (e.g., a difference in means) doesn't mean that all descriptions you can generate are causal.
 Whether a descriptive analysis overlaps with a causal analysis is a function of the data and the question.
 
@@ -301,7 +301,7 @@ As with prediction and description, it's better to start with a clear, precise q
 In statistics and data science, particularly as we swim through the ocean of data of the modern world, we often end up with an answer without a question (e.g., `42`).
 This, of course, makes interpretation of the answer difficult.
 In @sec-diag, we'll discuss the structure of causal questions.
-We'll discuss philosophical and practical ways to sharpen our questions in [Chapter -@sec-counterfactuals] and [Chapter -@sec-trials-std].
+We'll discuss philosophical and practical ways to sharpen our questions in [Chapter -@sec-counterfactuals].
 
 ::: callout-note
 ## Causal inference and explanation

diff --git a/chapters/02-whole-game.qmd b/chapters/02-whole-game.qmd
@@ -68,7 +68,7 @@ We have substantial, robust evidence in favor of bed net use, but let's consider
 -   We may also want to estimate a different effect or the effect for another population than in previous trials.
     For example, both randomized and observational studies helped us better understand that insecticide-based nets improve malaria resistance in the entire community, not just among those who use nets, so long as net usage is high enough [@howard2000; @hawley2003].
 
-As we'll see in @sec-trials-std and @sec-g-comp, the causal inference techniques that we'll discuss in this book are often beneficial even when we're able to randomize.
+As we'll see in @sec-strat-outcome and @sec-g-comp, the causal inference techniques that we'll discuss in this book are often beneficial even when we're able to randomize.
 
 When we conduct an observational study, it's still helpful to think through the randomized trial we would run were it possible.
 The trial we're trying to emulate in this causal analysis is the *target trial.* Considering the target trial helps us make our causal question more accurate.

diff --git a/chapters/03-counterfactuals.qmd b/chapters/03-counterfactuals.qmd
diff --git a/chapters/03-po-counterfactuals.qmd b/chapters/03-po-counterfactuals.qmd
diff --git a/chapters/05-dags.qmd b/chapters/04-dags.qmd
@@ -752,7 +752,7 @@ sim_data <- podcast_dag |>
 sim_data
 ```
 
-Since we have simulated this data, we know that this is a case where *standard methods will succeed* (see @sec-standard) and, therefore, can estimate the causal effect using a basic linear regression model.
+Since we have simulated this data, we know that this is a case where we can estimate the causal effect using a basic linear regression model.
 @fig-dag-sim shows a forest plot of the simulated data based on our DAG.
 Notice the model that only included the exposure resulted in a spurious effect (an estimate of -0.1 when we know the truth is 0).
 In contrast, the model that adjusted for the two variables as suggested by `ggdag_adjustment_set()` is not spurious (much closer to 0).