---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# halfmoon <img src="man/figures/logo.png" align="right" height="138" />
<!-- badges: start -->
[![R-CMD-check](https://github.com/r-causal/halfmoon/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/r-causal/halfmoon/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/malcolmbarrett/halfmoon/branch/main/graph/badge.svg)](https://app.codecov.io/gh/malcolmbarrett/halfmoon?branch=main)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![CRAN status](https://www.r-pkg.org/badges/version/halfmoon)](https://CRAN.R-project.org/package=halfmoon)
<!-- badges: end -->
> Within light there is darkness,
> but do not try to understand that darkness.
> Within darkness there is light,
> but do not look for that light.
> Light and darkness are a pair
> like the foot before and the foot behind in walking.
>
> -- From the Zen teaching poem [Sandokai](https://en.wikipedia.org/wiki/Sandokai).

The goal of halfmoon is to cultivate balance in propensity score models.
## Installation
You can install the most recent version of halfmoon from CRAN with:
``` r
install.packages("halfmoon")
```
You can also install the development version of halfmoon from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("malcolmbarrett/halfmoon")
```
## Example: Weighting
halfmoon includes several techniques for assessing the balance created by propensity score weights.
```{r example}
library(halfmoon)
library(ggplot2)

# weighted mirrored histograms
ggplot(nhefs_weights, aes(.fitted)) +
  geom_mirror_histogram(
    aes(group = qsmk),
    bins = 50
  ) +
  geom_mirror_histogram(
    aes(fill = qsmk, weight = w_ate),
    bins = 50,
    alpha = 0.5
  ) +
  scale_y_continuous(labels = abs)

# weighted ecdf
ggplot(
  nhefs_weights,
  aes(x = smokeyrs, color = qsmk)
) +
  geom_ecdf(aes(weights = w_ato)) +
  xlab("Smoking Years") +
  ylab("Proportion <= x")

# weighted SMDs
plot_df <- tidy_smd(
  nhefs_weights,
  race:active,
  .group = qsmk,
  .wts = starts_with("w_")
)

ggplot(
  plot_df,
  aes(
    x = abs(smd),
    y = variable,
    group = method,
    color = method
  )
) +
  geom_love()
```
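For intuition about what a weighted standardized mean difference (SMD) measures, here is a minimal base R sketch on simulated data. It uses one common definition: the difference in weighted group means divided by the square root of the average of the two weighted group variances. The helpers `weighted_var()` and `weighted_smd()` and the simulated `age`, `exposed`, and `w` are hypothetical names used only for illustration; `tidy_smd()` may compute the variance differently, so treat this as a sketch of the idea rather than halfmoon's implementation.

```r
# Simulated data, purely illustrative (this is not nhefs_weights)
set.seed(1)
n <- 200
exposed <- rbinom(n, 1, 0.4)
age <- rnorm(n, mean = 50 + 5 * exposed, sd = 10)
# Hypothetical weights; in practice they would come from a propensity score model
w <- runif(n, 0.5, 2)

# Weighted variance, normalized by the sum of the weights (an assumption)
weighted_var <- function(x, w) {
  m <- weighted.mean(x, w)
  sum(w * (x - m)^2) / sum(w)
}

# Difference in weighted means over the pooled weighted standard deviation
weighted_smd <- function(x, group, w = rep(1, length(x))) {
  g1 <- group == 1
  g0 <- group == 0
  mean_diff <- weighted.mean(x[g1], w[g1]) - weighted.mean(x[g0], w[g0])
  pooled_sd <- sqrt(
    (weighted_var(x[g1], w[g1]) + weighted_var(x[g0], w[g0])) / 2
  )
  mean_diff / pooled_sd
}

smd_unweighted <- weighted_smd(age, exposed)
smd_weighted <- weighted_smd(age, exposed, w)
```

With unit weights the function reduces to the ordinary SMD under the same variance convention, and rescaling every weight by a constant leaves the result unchanged.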
## Example: Matching
halfmoon also has support for working with matched datasets. Consider these two objects from the [MatchIt](https://github.com/kosukeimai/MatchIt) documentation:
```{r}
library(MatchIt)

# Default: 1:1 NN PS matching w/o replacement
m.out1 <- matchit(treat ~ age + educ + race + nodegree +
                    married + re74 + re75, data = lalonde)

# 1:1 NN Mahalanobis distance matching w/ replacement and
# exact matching on married and race
m.out2 <- matchit(treat ~ age + educ + race + nodegree +
                    married + re74 + re75, data = lalonde,
                  distance = "mahalanobis", replace = TRUE,
                  exact = ~ married + race)
```
One option is to just look at the matched dataset with halfmoon:
```{r}
matched_data <- get_matches(m.out1)

match_smd <- tidy_smd(
  matched_data,
  c(age, educ, race, nodegree, married, re74, re75),
  .group = treat
)

love_plot(match_smd)
```
The downside of this approach is that you can't compare multiple matching strategies to the observed dataset, and the label on the plot is wrong. halfmoon comes with a helper function, `bind_matches()`, that creates a dataset more appropriate for this task:
```{r}
matches <- bind_matches(lalonde, m.out1, m.out2)
head(matches)
```
`matches` includes a binary variable for each `matchit` object that indicates whether the row was included in the match. Since downweighting to 0 is equivalent to filtering the dataset to the matches, we can more easily compare multiple matched datasets with `.wts`:
```{r}
many_matched_smds <- tidy_smd(
  matches,
  c(age, educ, race, nodegree, married, re74, re75),
  .group = treat,
  .wts = c(m.out1, m.out2)
)

love_plot(many_matched_smds)
```
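The equivalence between zero weights and filtering can be checked directly in base R. The sketch below uses a small simulated data frame with a hypothetical `matched` indicator (standing in for the kind of column `bind_matches()` adds) and compares group means computed after filtering with weighted group means computed on all rows.

```r
# Simulated data, purely illustrative (this is not lalonde)
set.seed(2)
dat <- data.frame(
  treat = rep(c(1, 0), each = 10),
  age = round(rnorm(20, mean = 40, sd = 8))
)
# Hypothetical match indicator: 1 if a matching step kept the row, 0 otherwise
dat$matched <- rep(c(1, 0, 1, 1), times = 5)

# Option 1: filter to the matched rows, then summarize by group
filtered <- dat[dat$matched == 1, ]
means_filtered <- tapply(filtered$age, filtered$treat, mean)

# Option 2: keep every row and use the indicator as a weight
means_weighted <- sapply(
  split(dat, dat$treat),
  function(d) weighted.mean(d$age, d$matched)
)

# Both routes give the same group means
stopifnot(isTRUE(all.equal(as.vector(means_filtered), as.vector(means_weighted))))
```

Because the unmatched rows stay in the data with weight 0, several indicators can sit side by side in one data frame, which is what makes comparing multiple matching strategies convenient.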
We can also extend the idea that matching indicators are weights to weighted mirrored histograms, giving us a good idea of the range of propensity scores that are being removed from the dataset.
```{r}
# use the distance as the propensity score
matches$ps <- m.out1$distance

ggplot(matches, aes(ps)) +
  geom_mirror_histogram(
    aes(group = factor(treat)),
    bins = 50
  ) +
  geom_mirror_histogram(
    aes(fill = factor(treat), weight = m.out1),
    bins = 50,
    alpha = 0.5
  ) +
  scale_y_continuous(labels = abs)
```
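To see what the weighted layer is counting, here is a minimal base R sketch of weighted, mirrored bin counts on simulated data. It assumes that one group's counts are drawn below zero (which is why the plots relabel the y-axis with `abs`); `weighted_counts()`, `ps`, and `keep` are hypothetical names for illustration, and this is not halfmoon's internal implementation.

```r
# Simulated data, purely illustrative
set.seed(3)
n <- 100
treat <- rep(c(0, 1), each = n / 2)
ps <- runif(n)                            # stand-in for a propensity score
keep <- as.numeric(rbinom(n, 1, 0.6))     # hypothetical 0/1 match indicator

# Assign each score to one of ten equal-width bins on [0, 1]
bin <- cut(ps, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)

# Weighted count per bin: the sum of the weights that fall in the bin
weighted_counts <- function(bin, w) {
  vapply(levels(bin), function(b) sum(w[bin == b]), numeric(1))
}

# One group is shown above zero, the other mirrored below zero
top <- weighted_counts(bin[treat == 1], keep[treat == 1])
bottom <- -weighted_counts(bin[treat == 0], keep[treat == 0])

# With 0/1 weights, the weighted counts equal the counts of the kept rows
kept_counts <- as.vector(table(bin[treat == 1 & keep == 1]))
stopifnot(isTRUE(all.equal(as.vector(top), kept_counts)))
```

Overlaying these weighted counts on the unweighted counts for all rows shows, bin by bin, how many observations the match removed.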