```{r load_packages, echo=FALSE}
# Setup packages ---------------------------------------------------------------
# List of packages for session
.packages = c("plyr",
"ggplot2",
"knitr",
"grid",
"data.table",
"dplyr",
"vegan",
"reshape2"
)
# Install CRAN packages (if not already installed)
.inst <- .packages %in% installed.packages()
if(length(.packages[!.inst]) > 0) install.packages(.packages[!.inst])
# Load packages into session
suppressMessages(lapply(.packages, require, character.only=TRUE))
cat("\014") # Clear console
```
```{r, echo=FALSE}
opts_knit$set(root.dir="~/datadive_201503_sf-health-improvement-partnership/")
opts_chunk$set(fig.width=10, fig.height=5, dpi=120, warning=F, echo=F,
cache=T, message=F)
```
```{r theme}
custom_theme <- function(text_size=10, legend_size=0.4, margin_size=0.01) {
# Custom ggplot theme
return (theme(axis.text.x=element_text(angle=-90, size=text_size),
axis.text.y=element_text(size=text_size),
strip.text.x=element_text(size=text_size),
strip.text.y=element_text(size=text_size),
strip.background=element_rect(fill="white", color="grey"),
legend.key.size=unit(legend_size, "cm"),
legend.title=element_text(size=text_size),
legend.text=element_text(size=text_size),
title=element_text(size=text_size + 1),
panel.margin=unit(margin_size, "cm"),
panel.background = element_rect(fill="white", color="grey"),
panel.grid=element_line(colour="black"),
axis.ticks.margin=unit(margin_size, "cm"),
plot.margin=unit(rep(margin_size, 4), "cm"),
legend.position="right",
legend.margin=unit(margin_size, "cm")))
}
```
---
title: "SF-HIP Statistical Analysis"
author: Kris, Terence, Chao, Ray, and Violet
date: "March 29, 2015"
output: pdf_document
---
# Introduction
In this document, we collect the statistical approaches and insights from DataKind's
March 27-29, 2015 DataDive. While we highlight our most interesting findings, we
also wanted to share alternative lines of thought we had developed but not completely
refined, with the hope that this could guide future analysis.
## Goals
There are two overall goals of interest in this analysis: Global patterns and local
anomalies. On the one hand, we would like to summarise the relationships between
the presence of liquor stores and crimes that will apply generally across
San Francisco. On the other hand, we would like to highlight those locations
which somehow seem to deviate from any general patterns. In both cases, we would
like to integrate demographic information. For example, are there locations with
similar demographics and numbers of liquor stores, but very different crime rates?
## Approach
### Data Available
We consider three primary sources of data:
- Census information: Demographic information at the census tract level. This
includes overall population, population breakdown across races, unemployment
rate, and median income.
- Crime data: We have crime reports (from the SFPD?), mapping the time and place
of crimes within the city over the last 10 years. These crime reports also include
descriptions of the type of crime at varying levels of granularity -- a report
may be classified at a coarse level as robbery, and at a fine level as robbery
at an ATM machine, for instance.
- Alcohol license data: We have records of alcohol licenses over more than a decade.
These licenses are required by any venue that sells or distributes alcohol, including
bars, clubs, and convenience stores. These records include the location of these
vendors, as well as a license type (bars and liquor stores require different
licenses, for example).
### Data used
We chose to aggregate the crime and alcohol license data to the census tract
level, and then normalize by census tract population. More specifically, within
each census tract we (1) counted the number of venues using each of the 23
license types and (2) counted the number of crimes within each of the 30
description groups, then divided these counts by the population of that tract.
We could have used finer or coarser description types for both liquor vendors
and crimes, but this level seemed to offer a rich description without making the
problem so high dimensional that it becomes less tractable. Further, we discarded
those tracts with fewer than 500 people living within them, since
our estimates of the densities of crimes and liquor venues in such sparsely
populated areas are less reliable.
Notice that, at this stage, we have ignored (1) any spatial information at a finer resolution than the census tract level and (2) any
temporal effects. Nonetheless, we believe our methods could be generalized to handle
these situations as well.
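
The aggregation described above can be sketched on a toy table. This is only an illustration: the column names `license_count` and `crime_count` and every value below are made up, and the cutoff of 500 simply follows the text.

```r
# Toy tract-level counts; every value here is invented for illustration.
tracts <- data.frame(
  Tract2010 = c(101, 102, 103, 104),
  Pop2010 = c(4000, 250, 2500, 8000),
  license_count = c(12, 3, 5, 40),
  crime_count = c(80, 10, 25, 400)
)

# Drop sparsely populated tracts, whose rate estimates are unstable.
min_pop <- 500
kept <- tracts[tracts$Pop2010 >= min_pop, ]

# Convert raw counts to per-capita rates.
kept$license_rate <- kept$license_count / kept$Pop2010
kept$crime_rate <- kept$crime_count / kept$Pop2010

kept[, c("Tract2010", "license_rate", "crime_rate")]
```

Keeping the cutoff in a single variable such as `min_pop` makes it easy to check how sensitive later results are to the choice of threshold.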
### Methods
For the first task, identifying global patterns, our overall methods use two
steps: dimension reduction followed by some measure of association. By dimension
reduction, we mean reducing many different measurements to just a few -- for example,
the number of college educated people and the median income of a census tract can
both be explained by an underlying "affluence" effect. By association, we mean
taking these underlying factors and determining whether and how they are correlated.
The specific tools we applied to do this reduction vary in complexity. From
crudest to most (but still not very) sophisticated, we used
- Summing counts, which is the crudest form of dimension reduction
- Ignoring dimensions, another crude form of dimension reduction
- Clustering based on two sets of variables, then seeing how the clusterings compare
- Seeing how the distances induced by the two sets of variables compare
- Performing dimension reduction jointly
# Regression on Aggregated Counts
- First, just summing counts of licenses
- We need to compare rates
- We find that most rates are very small, with a few getting much larger, so we
use a log scale
- As a response, look at the liquor and all crimes
- Make plots, just to see association, follow-up with formal regressions to validate
visual intuition
- Regress with control covariates
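
As a minimal sketch of the log-scale regression outlined above, the following uses simulated rates rather than the DataDive data. The variable names and the distributions are assumptions chosen only to produce small, right-skewed values.

```r
set.seed(1)

# Simulated, heavy-tailed per-capita rates (not the DataDive data).
license_rate <- rexp(200, rate = 50)
crime_rate <- license_rate * exp(rnorm(200, sd = 0.5))

# log1p(x) computes log(1 + x), so tracts with a rate of zero stay finite.
fit <- lm(log1p(crime_rate) ~ log1p(license_rate))
coef(fit)
```

The slope describes the association on the transformed scale, so it should be read alongside a scatterplot on the same scale rather than as a change in raw rates.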
## Liquor Crimes ---
- Code for liquor crimes only
Setup, since haven't run code yet.
```{r regress_liquor_crime}
# Load the data
crime_census_alcohol <- read.csv("~/datadive_201503_sf-health-improvement-partnership/data/processed_data/crime_census_alcohol.csv")
# Filter down to those we have reliable rate estimates for
crime_census_alcohol <- filter(crime_census_alcohol, Pop2010 > 1000)
# Give labels to interesting columns
all_cols <- colnames(crime_census_alcohol)
license_cols <- all_cols[grep("license", all_cols)]
crime_cols <- all_cols[grep("crime", all_cols)]
census_cols <- setdiff(all_cols, c(license_cols, crime_cols, "Tract2010"))
# Labels for crimes that are more plausibly related to alcohol
alcohol_crime_cols <- c("crime_liquor_laws", "crime_driving_under_the_influence",
"crime_drunkenness", "crime_loitering",
"crime_family_offenses", "crime_drug_narcotic",
"crime_disorderly_conduct")
violent_crime_cols <- c("crime_assault", "crime_robbery",
"crime_sex_offenses_forcible", "crime_suicide")
```
```{r}
license_total <- rowSums(crime_census_alcohol[, license_cols])
alcohol_crime_total <- rowSums(crime_census_alcohol[, alcohol_crime_cols])
violent_crime_total <- rowSums(crime_census_alcohol[, violent_crime_cols])
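# Log transform (see appendix for motivation)
# Note: license columns are offset by 1 on the next line and again inside
# log(1 + .), so they end up as log(2 + x); crime columns end up as log(1 + x).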
crime_census_alcohol[, license_cols] <- crime_census_alcohol[, license_cols] + 1
crime_census_alcohol[, c(license_cols, crime_cols)] <- log(1 + crime_census_alcohol[, c(license_cols, crime_cols)])
```
## Alcohol Related Crimes ---
- Interpretation of plot, for all grouped together
+ Increasing density of liquor establishments is associated with an increase
in number of liquor related crimes (mention what they are)
+ For a fixed density of liquor establishments, there is still a substantial variation
across crime rates
- Condition on income level, unemployment
+ Look at shaded plots: pattern is not so strong by eye. Different colors mean
different levels of income, unemployment within census tract
+ Look at faceted plots
- Interpretations
+ Seems like the effect size of increases in liquor store density on liquor crime
density is larger in neighborhoods with below median income
+ Run regressions to quantify these differences
+ Run joint regression y ~ beta0 + (beta1 + beta2 * I(above median)) * liquor
* This turns out to not be significant. So, even though visually we see one
pattern, this is not formally statistically significant
+ Run joint regression y ~ beta0 + beta1 * liquor + beta2 * median income
* The idea is that we can now interpret the liquor store effect while controlling for median income
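
The regressions listed above can be written as R formulas. The sketch below uses simulated data, so `liquor`, `above_median`, and `y` are hypothetical stand-ins. Note that `liquor * above_median` also adds a separate intercept shift for the group, whereas `liquor + liquor:above_median` matches the form y ~ beta0 + (beta1 + beta2 * I(above median)) * liquor exactly.

```r
set.seed(2)
n <- 300
liquor <- runif(n)                                   # simulated log license density
above_median <- rep(c(FALSE, TRUE), length.out = n)  # simulated income indicator

# Simulated response whose slope is smaller in above-median tracts.
y <- 1 + (2 - 1 * above_median) * liquor + rnorm(n, sd = 0.1)

# Slope-only interaction: beta0 + (beta1 + beta2 * above_median) * liquor
slope_fit <- lm(y ~ liquor + liquor:above_median)

# Full interaction: liquor + above_median + liquor:above_median
interaction_fit <- lm(y ~ liquor * above_median)

# Additive model: a shared slope, with the group shifting only the intercept
additive_fit <- lm(y ~ liquor + above_median)

names(coef(interaction_fit))
anova(additive_fit, interaction_fit)  # F-test for the interaction term
```

Comparing the additive and full interaction fits with `anova` tests whether the slope differs between groups, which is the question the faceted plots raise visually.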
```{r}
# Combine data
above_median_inc <- I(crime_census_alcohol$med_income > median(crime_census_alcohol$med_income))
below_median_unemp <- I(crime_census_alcohol$Unemploy_p < median(crime_census_alcohol$Unemploy_p))
liquor_laws_data <- cbind(license_total, alcohol_crime_total, crime_census_alcohol[, c("med_income", "Unemploy_p")], above_median_inc, below_median_unemp)
# Make plots
ggplot(liquor_laws_data) +
geom_point(aes(x=license_total, y=alcohol_crime_total, col=med_income), size=4) +
scale_color_gradient2(midpoint=7e4, mid="plum", high="red", low="steelblue") +
scale_x_log10() +
scale_y_log10() +
ggtitle("Alcohol Access Density vs. Liquor Crime Density, colored by income") +
custom_theme()
ggplot(liquor_laws_data) +
geom_point(aes(x=license_total, y=alcohol_crime_total, col=Unemploy_p), size=4) +
scale_x_log10() +
scale_y_log10() +
scale_color_gradient2(midpoint=10, mid="plum", high="red", low="steelblue") +
ggtitle("Alcohol Access Density vs. Liquor Crime Density, colored by unemployment") +
custom_theme()
```
```{r}
# Make plots
ggplot(liquor_laws_data) +
geom_point(aes(x=license_total, y=alcohol_crime_total, col=med_income), size=4) +
scale_color_gradient2(midpoint=7e4, mid="plum", high="darkblue", low="red") +
scale_x_log10() +
scale_y_log10() +
facet_wrap(~above_median_inc) +
ggtitle("Alcohol Access Density vs. Alcohol-related Crime Density, tracts above and below median income") +
custom_theme()
ggplot(liquor_laws_data) +
geom_point(aes(x=license_total, y=alcohol_crime_total, col=Unemploy_p), size=4) +
scale_x_log10() +
scale_y_log10() +
facet_wrap(~below_median_unemp) +
scale_color_gradient2(midpoint=10, mid="plum", high="red", low="darkblue") +
ggtitle("Alcohol Access Density vs. Alcohol-Related Crime Density, tracts above and below median unemployment") +
custom_theme()
```
```{r}
# In an R formula, a * b expands to a + b + a:b, so the main effect of
# log(1 + license_total) does not need to be listed separately.
# Not actually significant in interaction model...
liquor_law_interact_inc_model <- lm(log(1 + alcohol_crime_total) ~ log(1 + license_total) * above_median_inc, data=liquor_laws_data)
summary(liquor_law_interact_inc_model)
# Not even unemployment makes a difference
liquor_law_interact_unemp_model <- lm(log(1 + alcohol_crime_total) ~ log(1 + license_total) * below_median_unemp, data=liquor_laws_data)
summary(liquor_law_interact_unemp_model)
# Pretty significant here
liquor_law_inc_control_model <- lm(log(1 + alcohol_crime_total) ~ log(1 + license_total) + med_income, data=liquor_laws_data)
summary(liquor_law_inc_control_model)
# Kinda significant here
liquor_law_unemp_control_model <- lm(log(1 + alcohol_crime_total) ~ log(1 + license_total) + Unemploy_p, data=liquor_laws_data)
summary(liquor_law_unemp_control_model)
# Not significant here though...
liquor_law_unemp_interaction_model <- lm(log(1 + alcohol_crime_total) ~ log(1 + license_total) * Unemploy_p, data=liquor_laws_data)
summary(liquor_law_unemp_interaction_model)
```
## Violent Crimes---
+ What about violent crimes? Repeat same bullets as above
```{r}
liquor_laws_data <- cbind(license_total, violent_crime_total, crime_census_alcohol[, c("med_income", "Unemploy_p")], above_median_inc, below_median_unemp)
# Make plots
ggplot(liquor_laws_data) +
geom_point(aes(x=license_total, y=violent_crime_total, col=med_income), size=4) +
scale_color_gradient2(midpoint=7e4, mid="plum", high="darkblue", low="red") +
scale_x_log10() +
scale_y_log10() +
facet_wrap(~above_median_inc) +
ggtitle("Alcohol Access Density vs. Violent Crime Density, tracts above and below median income") +
custom_theme()
ggplot(liquor_laws_data) +
geom_point(aes(x=license_total, y=violent_crime_total, col=Unemploy_p), size=4) +
scale_x_log10() +
scale_y_log10() +
facet_wrap(~below_median_unemp) +
scale_color_gradient2(midpoint=10, mid="plum", high="red", low="darkblue") +
ggtitle("Alcohol Access Density vs. Violent Crime Density, tracts above and below median unemployment") +
custom_theme()
```
```{r}
# In an R formula, a * b expands to a + b + a:b, so the main effect of
# log(1 + license_total) does not need to be listed separately.
# Not actually significant in interaction model...
liquor_law_interact_inc_model <- lm(log(1 + violent_crime_total) ~ log(1 + license_total) * above_median_inc, data=liquor_laws_data)
summary(liquor_law_interact_inc_model)
# Not even unemployment makes a difference
liquor_law_interact_unemp_model <- lm(log(1 + violent_crime_total) ~ log(1 + license_total) * below_median_unemp, data=liquor_laws_data)
summary(liquor_law_interact_unemp_model)
# Pretty significant here
liquor_law_inc_control_model <- lm(log(1 + violent_crime_total) ~ log(1 + license_total) + med_income, data=liquor_laws_data)
summary(liquor_law_inc_control_model)
# Kinda significant here
liquor_law_unemp_control_model <- lm(log(1 + violent_crime_total) ~ log(1 + license_total) + Unemploy_p, data=liquor_laws_data)
summary(liquor_law_unemp_control_model)
# Significant interaction
liquor_law_unemp_interaction_model <- lm(log(1 + violent_crime_total) ~ log(1 + license_total) * Unemploy_p, data=liquor_laws_data)
summary(liquor_law_unemp_interaction_model)
```
# General Patterns in Unaggregated Counts ---
- Everything so far has been one dimensional, really ought to consider
different crimes and liquor types separately, do more intelligent aggregation
- Some groups of license types / crime types are quantitatively grouped together [still focus on liquor related crimes]
- Further, which license types are associated with which types of crimes?
Which groups of license types are associated with which groups of crimes?
- Do any tracts show an especially high level of (1) one group
of licenses, or (2) one group of crimes? What about groups of tracts?
- The point is that it's easy to speak about single crime and license types at the single tract level, as well as the sum across crimes, licenses and tracts. But, we would appreciate intermediate levels of resolution if there is some grouping on these variables (but the regime is not entirely homogeneous)
- Two very general statistical tools available for this sort of reduction are
multivariate analysis and clustering.
## Cluster Analysis
### Clustering tracts
- Define a distance between tracts, based on different kinds of data
- Which tracts group together? How similar are they across the different metrics?
- Maybe clustering tracts using similarity across sums within clustered columns?
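
One way the questions above could be explored is to cluster tracts separately on each data source and cross-tabulate the labels. The sketch below is an assumption-laden illustration: the feature matrices are simulated, and Euclidean distance with complete-linkage hierarchical clustering is just one reasonable choice, not a method the analysis settled on.

```r
set.seed(3)
n_tracts <- 30

# Simulated features for two data sources (hypothetical stand-ins).
license_feats <- matrix(rnorm(n_tracts * 4), n_tracts, 4)
crime_feats <- matrix(rnorm(n_tracts * 5), n_tracts, 5)

cluster_tracts <- function(feats, k) {
  # Euclidean distance between standardized tracts, then hierarchical
  # clustering (complete linkage is the hclust default), cut into k groups.
  cutree(hclust(dist(scale(feats))), k = k)
}

license_clusters <- cluster_tracts(license_feats, k = 3)
crime_clusters <- cluster_tracts(crime_feats, k = 3)

# Cross-tabulate the two labelings: counts concentrated in a few cells
# would suggest the two data sources group tracts similarly.
table(license = license_clusters, crime = crime_clusters)
```

With independent simulated features the table should look diffuse; on real data, concentrated cells would point to tracts that group together under both views.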
## Multivariate Analysis on Unaggregated ---
- Clustering assigns a hard label to each tract
- Multivariate analysis attempts to find a lower dimensional representation of the
data that doesn't lose too much information. What are the sources of maximum
variation?
### Bivariate Correlations ---
- Before full multivariate analysis, consider the largest bivariate correlations
- All crimes, and just crimes of interest
- Interpretation: The highest correlation is within data groups
+ But nonetheless very strong association across groups
+ Consider list of top correlations across crime, license, and demographics
+ List is of limited utility: It's unnecessarily verbose. If multiple crimes are
all correlated with each other, they will all give high correlations with license
types.
```{r}
ProcessCors <- function(data) {
cor_data <- cor(data)
rownames(cor_data) <- gsub("crime_", "", rownames(cor_data))
colnames(cor_data) <- gsub("crime_", "", colnames(cor_data))
diag(cor_data) <- 0
# get melted correlation matrix
m_cor_data <- arrange(melt(cor_data), desc(abs(value)))
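# remove duplicate correlations: (a, b) and (b, a) share a value, so after
# sorting they are adjacent and we keep every other row (this assumes no two
# distinct pairs have exactly tied correlations)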
m_cor_data <- m_cor_data[seq(1, nrow(m_cor_data), by=2), ]
return(list(cor_data=cor_data, m_cor_data=m_cor_data))
}
# All laws
all_laws_cors <- ProcessCors(crime_census_alcohol[, c(census_cols, crime_cols, license_cols)])
head(all_laws_cors$m_cor_data)
# Alcohol and violent crime cols
alc_laws_cors <- ProcessCors(crime_census_alcohol[, c(census_cols, violent_crime_cols, alcohol_crime_cols, license_cols)])
head(alc_laws_cors$m_cor_data)
```
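
As an alternative sketch for listing each pair of variables once, the strict upper triangle of the correlation matrix can be indexed directly, which avoids relying on sort order. The helper name `top_correlations` is hypothetical, and the built-in `mtcars` data is used only so the example runs on its own.

```r
top_correlations <- function(data, n = 6) {
  cor_mat <- cor(data)
  # Each unordered pair appears exactly once in the strict upper triangle.
  idx <- which(upper.tri(cor_mat), arr.ind = TRUE)
  pairs <- data.frame(
    var1 = rownames(cor_mat)[idx[, "row"]],
    var2 = colnames(cor_mat)[idx[, "col"]],
    value = cor_mat[idx]
  )
  # Sort by absolute correlation, strongest first.
  pairs <- pairs[order(-abs(pairs$value)), ]
  head(pairs, n)
}

top_correlations(mtcars[, 1:5], n = 3)
```

Restricting the index to pairs whose two variables come from different column groups would give the across-group list directly, which addresses the verbosity noted above.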
### Basic Clustering & Biclustering ---
- A brief digression to show the biclustering of data according to both raw data
and the correlations
- The interpretation is that we try to hierarchically cluster features, based on raw
data or correlations between them
- Groups of correlations or groups of actual data that are most similar to each other
are merged first
- This is usually just a nice preliminary visualization, not too much insight either
- Yellow indicates higher correlation than red
- Blocking is interesting: These are variables that can be essentially collapsed.
- But we see this pattern of crimes with each other and licenses with each other.
- Notice it's symmetric, trees are identical on left and top.
- Also, some substantial correlation across groups of cols, as expected from before
- Mention the top few: Median income and college. Median Income and License 85.
- 85 and 45 are merged first: These are most similar license types
- Much more interpretation is possible
+ Can do the same with tracts as one of the heatmap dimensions.
```{r}
# all laws
heatmap(all_laws_cors$cor_data)
# alcohol crimes
heatmap(alc_laws_cors$cor_data)
```
```{r, fig.height=10, fig.width=10}
heatmap(scale(crime_census_alcohol[, c(census_cols, violent_crime_cols, alcohol_crime_cols, license_cols)]))
```
### Canonical Correlation Analysis ---
- Dimension reduction method popular when we have multiple tables. It can be
viewed as a generalization of PCA to the setting with several tables
- Interpretation
+ Satisfying that crimes that seem similar to each other are labeled that way
+ Can look at tracts that are overrepresented in some kinds of crimes more than others
+ Doesn't seem to really associate with either income or unemployment. There is
some unknown variation in tracts that drives this projection, but we haven't
found the feature (or groups of features) driving that, at least not at this
more cursory analysis...
```{r, fig.show='hide'}
cancor_result <- cancor(scale(crime_census_alcohol[, license_cols]),
scale(crime_census_alcohol[, c(violent_crime_cols, alcohol_crime_cols)]))
cancor_plot_data <- rbind(data.frame(cancor_result$xcoef[, 1:2], type="license", name=gsub("license_", "", license_cols)),
data.frame(cancor_result$ycoef[, 1:2], type="crime", name=gsub("crime_", "", c(violent_crime_cols, alcohol_crime_cols))))
ggplot(cancor_plot_data) + geom_text(aes(x=X1,y=X2,label=name))
```
```{r}
ggplot(cancor_plot_data) + geom_text(aes(x=X1,y=X2,label=name), size=3) +
ggtitle("Canonical correlations on table features")
```
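
To make the output of `cancor` easier to read, here is a small sketch on simulated tables that share one latent factor. The data and names (`shared`, `x`, `y`) are assumptions for illustration. The plot above labels the rows of `xcoef` and `ycoef`; the sketch shows how those coefficients turn each table into canonical variates.

```r
set.seed(4)
n <- 100
shared <- rnorm(n)  # simulated latent factor driving both tables

x <- cbind(x1 = shared + rnorm(n, sd = 0.3), x2 = rnorm(n))
y <- cbind(y1 = shared + rnorm(n, sd = 0.3), y2 = rnorm(n), y3 = rnorm(n))

cc <- cancor(scale(x), scale(y))
cc$cor  # canonical correlations, largest first

# Canonical variates: project each table onto its coefficient vectors.
x_scores <- scale(x) %*% cc$xcoef
y_scores <- scale(y) %*% cc$ycoef

# The first pair of variates recovers the first canonical correlation.
cor(x_scores[, 1], y_scores[, 1])
```

Plotting the tract-level scores, rather than only the coefficients, is one way to look for tracts that sit far from the rest along a canonical direction.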
# Detection of Anomalous Tracts ---
## Highly ranked ---
### Crime ---
- What are the tracts that have the highest violent crime, liquor related crime
+ Looking at the tracts with the highest liquor related crime rate, the top tracts are actually all clustered around the same area near the Tenderloin district.
```{r}
alcohol_crimes <- melt(crime_census_alcohol[, c("Tract2010", alcohol_crime_cols)],
id.vars="Tract2010")
alcohol_crimes <- ddply(alcohol_crimes, .(Tract2010), transform, mean_val=mean(value))
alcohol_crimes <- arrange(alcohol_crimes, desc(mean_val))[1:200, ]
ggplot(alcohol_crimes) +
geom_point(aes(x=reorder(as.factor(Tract2010), desc(value)), y=value, col=variable)) +
custom_theme() +
ggtitle("Tracts with highest liquor related crime")
```
```{r}
violent_crimes <- melt(crime_census_alcohol[, c("Tract2010", violent_crime_cols)],
id.vars="Tract2010")
violent_crimes <- ddply(violent_crimes, .(Tract2010), transform, mean_val=mean(value))
violent_crimes <- arrange(violent_crimes, desc(mean_val))[1:200, ]
ggplot(violent_crimes) +
geom_point(aes(x=reorder(as.factor(Tract2010), desc(value)), y=value, col=variable)) +
custom_theme() +
ggtitle("Tracts with highest violent crime")
```
### Liquor ---
- What are tracts with the highest liquor store densities, overall
- Interestingly, our data show that the highest density of liquor stores is in the Financial District, and the census tract corresponding to that neighborhood (Census Tract 11700 in the plot) also ranks very high in liquor related crime rate and violent crime rate. However, note that this does not necessarily support the claim that the Financial District has the highest crime rate because it has the highest density of liquor stores, since both rates and densities are computed by dividing by the population count. Because the Financial District is more of a commercial neighborhood than a residential area, it could be an outlier in our interpretation. Note that we have already excluded other outliers of this sort with low population counts, such as the Golden Gate Park census tract.
- Just on site
- Just off site
```{r}
licenses <- melt(crime_census_alcohol[, c("Tract2010", license_cols)],
id.vars="Tract2010")
licenses <- ddply(licenses, .(Tract2010), transform, mean_val=mean(value))
licenses <- arrange(licenses, desc(mean_val))[1:200, ]
ggplot(licenses) +
geom_point(aes(x=reorder(as.factor(Tract2010), desc(value)), y=value, col=variable)) +
custom_theme() +
ggtitle("Tracts with highest density of licenses")
```
## Matching: Comparing different distances ---
- When do two tracts seem very similar according to one data set but very
different according to another?
- What are these tracts?
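
A rough sketch of the matching idea: compute pairwise distances between tracts under two feature sets, then rank pairs by how much the two distances disagree. Everything below is simulated, and the names `demo_feats`, `crime_feats`, and `gap` are hypothetical.

```r
set.seed(5)
n_tracts <- 20
tract_ids <- paste0("tract_", seq_len(n_tracts))

# Simulated features for two data sources (hypothetical stand-ins).
demo_feats <- matrix(rnorm(n_tracts * 3), n_tracts, 3,
                     dimnames = list(tract_ids, NULL))
crime_feats <- matrix(rnorm(n_tracts * 4), n_tracts, 4,
                      dimnames = list(tract_ids, NULL))

demo_dist <- as.matrix(dist(scale(demo_feats)))
crime_dist <- as.matrix(dist(scale(crime_feats)))

# One row per unordered pair of tracts.
idx <- which(upper.tri(demo_dist), arr.ind = TRUE)
pairs <- data.frame(
  tract_a = tract_ids[idx[, "row"]],
  tract_b = tract_ids[idx[, "col"]],
  demo_dist = demo_dist[idx],
  crime_dist = crime_dist[idx]
)

# Large gap: similar demographics but very different crime profiles.
pairs$gap <- pairs$crime_dist - pairs$demo_dist
head(pairs[order(-pairs$gap), ], 5)
```

Because the two feature sets have different numbers of columns, the raw distances are not on the same scale; comparing ranks of the distances instead of their differences may be a fairer choice.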
# Limitations ---
- Sales level information
- Proxies for patrolling bias
- Integration of 311 data
- Incorporation of spatial distance function
# Appendix
## Histograms of measured Variables across tracts
```{r}
# Put these in an appendix
ggplot(melt(crime_census_alcohol[, census_cols])) +
geom_histogram(aes(x=value)) +
facet_wrap(~variable, scales="free") +
custom_theme() +
ggtitle("Demographic Information across Tracts")
# Maybe worth putting in a pairs plot as well?
ggplot(melt(crime_census_alcohol[, alcohol_crime_cols])) +
geom_histogram(aes(x=value)) +
facet_wrap(~variable, scales="free") +
scale_y_sqrt() +
custom_theme() +
ggtitle("Rates of Alcohol-Related Crimes across Tracts")
ggplot(melt(crime_census_alcohol[, setdiff(crime_cols, alcohol_crime_cols)])) +
geom_histogram(aes(x=value)) +
facet_wrap(~variable, scales="free") +
scale_y_sqrt() +
custom_theme() +
ggtitle("Rates of Alcohol-Unrelated Crimes across Tracts")
ggplot(melt(crime_census_alcohol[, license_cols])) +
geom_histogram(aes(x=value)) +
facet_wrap(~variable, scales="free") +
scale_y_sqrt() +
custom_theme() +
ggtitle("License Rates across Tracts")
```