Merge pull request #197 from tgerke/tg-edits
minor chapter 4 edits
LucyMcGowan authored Nov 7, 2023
2 parents 6b81afc + 5d3cf3b commit 6350185
Showing 1 changed file with 37 additions and 7 deletions: chapters/chapter-04.qmd
@@ -46,7 +46,7 @@ In @tbl-protocol we map each of these elements to the corresponding assumption t

Assumption | Eligibility Criteria | Exposure Definition| Assignment Procedures | Follow-up Period | Outcome Definition | Causal contrast | Analysis Plan
------------|----------------- | ------------------|---------|----------|----------|--------- | -------
- Consistency (Well defined exposure) | |`r emo::ji("heavy_check_mark")`|| | |
+ Consistency (Well defined exposure) |`r emo::ji("heavy_check_mark")`|`r emo::ji("heavy_check_mark")`|| | |
Consistency (No interference) | | `r emo::ji("heavy_check_mark")`|`r emo::ji("heavy_check_mark")` | | `r emo::ji("heavy_check_mark")` | | `r emo::ji("heavy_check_mark")`
Positivity |`r emo::ji("heavy_check_mark")`||`r emo::ji("heavy_check_mark")`|| | | `r emo::ji("heavy_check_mark")`
Exchangeability |`r emo::ji("heavy_check_mark")`||`r emo::ji("heavy_check_mark")`|`r emo::ji("heavy_check_mark")`|| | `r emo::ji("heavy_check_mark")`
@@ -82,7 +82,13 @@ ggplot(data, aes(x = x, y = y)) +

## Target Trials

- There are many reasons why randomization may not be possible. For example, it might not be ethical to randomly assign people to a particular exposure, there may not be funding available to run a randomized trial, or there might not be enough time to conduct a full trial. In these situations, we rely on observational data to help us answer causal questions by implementing a *target trial*. A *target trial* answers: What experiment would you design if you could?
+ There are many reasons why randomization may not be possible. For example, it might not be ethical to randomly assign people to a particular exposure, there may not be funding available to run a randomized trial, or there might not be enough time to conduct a full trial. In these situations, we rely on observational data to help us answer causal questions by implementing a *target trial*.

+ A *target trial* answers: What experiment would you design if you could?
+ Specifying a target trial is nearly identical to the process we described for a randomized trial.
+ We define eligibility, exposure, follow-up period, outcome, estimate of interest, and the analysis plan.
+ The key difference with the target trial in the observational setting, of course, is that we cannot assign exposure.
+ The analysis planning and execution step of the target trial is the most technically involved and a core focus of this book: for example, using DAGs to ensure that we have measured and are controlling for the right set of confounders, composing statistical programs that invoke an appropriate adjustment method such as IP weighting, and conducting sensitivity analyses to assess how robust our conclusions are to unmeasured confounding or misspecification.
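
To make the DAG step concrete, here is a minimal sketch (not the chapter's code; the DAG structure and variable names are illustrative) of asking `dagitty` for a sufficient adjustment set:

```{r}
# a minimal sketch, assuming a simple confounding structure; the DAG
# and variable names here are illustrative, not taken from the chapter
library(ggdag)

dag <- dagify(
  outcome ~ exposure + confounder,
  exposure ~ confounder,
  exposure = "exposure",
  outcome = "outcome"
)

# which set of variables closes all backdoor paths?
dagitty::adjustmentSets(dag)
#> { confounder }
```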

## Causal inference with `group_by()` and `summarize()` {#sec-group-sum}

@@ -134,7 +140,31 @@ sim <- tibble(

Here we have one binary `confounder`; the probability that `confounder = 1` is `0.5`.
The probability of being exposed is `0.75` for those for whom `confounder = 1` and `0.25` for those for whom `confounder = 0`.
- There is no effect of the `exposure` on the `outcome` (the true causal effect is 0); the `outcome` effect is fully dependent on the `confounder`. In this simulation we generate the potential outcomes to drive home our assumptions; many of our simulations in this book will skip this step.
+ There is no effect of the `exposure` on the `outcome` (the true causal effect is 0); the `outcome` effect is fully dependent on the `confounder`.

+ ```{r}
+ #| label: basic-dag
+ #| echo: false
+ #| warning: false
+ #| fig-cap: "Causal Diagram of Classic Confounding"
+ library(ggdag)
+ coords <- list(
+   x = c(confounder = 1, exposure = 2, outcome = 3),
+   y = c(confounder = -1, exposure = 0, outcome = 0)
+ )
+ dag <- dagify(
+   outcome ~ confounder,
+   exposure ~ confounder,
+   coords = coords
+ )
+ ggdag(dag) +
+   theme_dag()
+ ```

+ In this simulation we generate the potential outcomes to drive home our assumptions; many of our simulations in this book will skip this step.
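
The `sim <- tibble(...)` block itself is collapsed in this diff; a minimal sketch of a data-generating process consistent with the description above (the potential-outcome column names `y0` and `y1` are our own) might look like:

```{r}
# a sketch, not the chapter's code: simulate a binary confounder, an
# exposure that depends on it, and potential outcomes with zero effect
library(tidyverse)
set.seed(1)

n <- 10000
sim <- tibble(
  confounder = rbinom(n, 1, 0.5),
  p_exposure = ifelse(confounder == 1, 0.75, 0.25),
  exposure = rbinom(n, 1, p_exposure),
  # potential outcomes depend only on the confounder (true effect = 0)
  y0 = confounder + rnorm(n),
  y1 = y0,
  outcome = (1 - exposure) * y0 + exposure * y1
)
```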
Let's look at this generated data frame.

```{r}
@@ -144,7 +174,7 @@ sim |>

Great! Let's begin by proving to ourselves that this violates the exchangeability assumption. Recall from @sec-assump:

- > **Exchangeability**: We assume that within levels of relevant variables (confounders), exposed and unexposed subjects have an equal likelihood of experiencing any outcome prior to exposure; i.e. the exposed and unexposed subjects are exchangeable. This assumption is sometimes referred to as **no unmeasured confounding**.
+ > **Exchangeability**: We assume that within levels of relevant variables (confounders), exposed and unexposed subjects have an equal likelihood of experiencing any outcome prior to exposure; i.e. the exposed and unexposed subjects are exchangeable. This assumption is sometimes referred to as **no unmeasured confounding**, though exchangeability implies more than that, such as no selection bias and that confounder relationships are appropriately specified. We will further define exchangeability through the lens of DAGs in the next chapter.
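
The chapter's demonstration is collapsed in this diff; one minimal way to check, assuming the potential-outcome columns `y0` and `y1` from the simulation sketch above, is to compare the average potential outcomes across exposure groups:

```{r}
# if exchangeability held, the exposed and unexposed groups would have
# similar average *potential* outcomes; here they differ because the
# confounder drives both exposure and outcome
sim |>
  group_by(exposure) |>
  summarize(mean_y0 = mean(y0), mean_y1 = mean(y1))
```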
Now, let's try to estimate the effect of the `exposure` on the `outcome` assuming the two exposure groups are exchangeable.
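
The chapter's code for this naive comparison is collapsed just below; a sketch under the same assumed column names:

```{r}
# naive estimate: difference in mean outcomes, ignoring the confounder
sim |>
  group_by(exposure) |>
  summarize(avg_outcome = mean(outcome)) |>
  pivot_wider(
    names_from = exposure,
    values_from = avg_outcome,
    names_prefix = "exposure_"
  ) |>
  mutate(naive_effect = exposure_1 - exposure_0)
```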

@@ -174,7 +204,7 @@ sim |>
```

Ok, so assuming the exposure groups are exchangeable (and assuming the rest of the assumptions from @sec-assump hold), we estimate the effect of the exposure on the outcome to be 0.53.
- We *know* the exchaneability assumption is violated based on how we simulated the data.
+ We *know* the exchangeability assumption is violated based on how we simulated the data.
How can we estimate an unbiased effect?
The easiest way to do so is to estimate the effect within each confounder class.
This will work because folks with the same value of the confounder have an equal probability of exposure.
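
The chapter's implementation is collapsed in this diff; a minimal sketch of the stratified `group_by()`/`summarize()` approach, under the same assumed column names:

```{r}
# stratified estimate: within levels of the confounder, exposure is
# as-good-as-random, so the within-stratum contrasts are unconfounded
sim |>
  group_by(confounder, exposure) |>
  summarize(avg_outcome = mean(outcome), .groups = "drop") |>
  pivot_wider(
    names_from = exposure,
    values_from = avg_outcome,
    names_prefix = "exposure_"
  ) |>
  mutate(effect = exposure_1 - exposure_0)
```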
@@ -351,7 +381,7 @@ When you have no confounders and there is a linear relationship between the expo
Even in these cases, using the methods you will learn in this book can help.

1. Adjusting for baseline covariates can make an estimate *more efficient*
- 2. Propensity score weighting is *more efficient* that direct adjustment
+ 2. Propensity score weighting is *more efficient* than direct adjustment
3. Sometimes we are *more comfortable with the functional form of the propensity score* (predicting exposure) than the outcome model

Let's look at an example.
@@ -558,7 +588,7 @@ Three ways to estimate a causal effect in a non-randomized setting

First, let's look at @tbl-panel-2-1.
Here, we see that the unadjusted effect is *biased* (it differs from the true effect, 1, and the true effect is *not* contained in the reported 95% confidence interval).
- Now lets compare @tbl-panel-2-2 and @tbl-panel-2-3.
+ Now let's compare @tbl-panel-2-2 and @tbl-panel-2-3.
Technically, both are estimating unbiased causal effects.
The output in the `Beta` column of @tbl-panel-2-2 is technically a *conditional* effect (and often in causal inference we want marginal effects), but because there is no treatment heterogeneity in this simulation, the conditional and marginal effects are equal.
@tbl-panel-2-3, using the propensity score, also estimates an unbiased effect, but it is no longer the most *efficient* (that was true when the baseline covariates were merely causal for `y`, now that they are `confounders` the efficiency gains for using propensity score weighting are not as clear cut).
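
For orientation, a schematic of the three estimators being compared (the chapter's simulation and model code are collapsed in this diff; `d`, `x`, `y`, and `z` are placeholder names, not the chapter's):

```{r}
# a sketch with placeholder names: exposure x, outcome y, confounder z
unadjusted <- lm(y ~ x, data = d)    # biased when z confounds x and y
adjusted <- lm(y ~ x + z, data = d)  # direct (conditional) adjustment

# propensity score weighting: model the exposure, then fit the
# unadjusted outcome model with inverse probability weights
d_wt <- d |>
  mutate(
    ps = predict(glm(x ~ z, data = d, family = binomial), type = "response"),
    w_ate = x / ps + (1 - x) / (1 - ps)
  )
weighted_fit <- lm(y ~ x, data = d_wt, weights = w_ate)
```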
