-
Notifications
You must be signed in to change notification settings - Fork 51
/
07-Model.Rmd
executable file
·100 lines (60 loc) · 2.63 KB
/
07-Model.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
---
title: "Model"
output: html_notebook
editor_options:
chunk_output_type: inline
---
<!-- This file by Charlotte Wickham is licensed under a Creative Commons Attribution 4.0 International License, adapted from the orignal work at https://github.com/rstudio/master-the-tidyverse by RStudio. -->
```{r setup, message=FALSE}
library(tidyverse)
library(modelr)
library(broom)
wages <- heights %>% filter(income > 0)
```
## Your Turn 1
Fit the model on the slide and then examine the output. What does it look like?
```{r}
mod_e <- lm(log(income) ~ education, data = wages)
mod_e
```
## Your Turn 2
Use a pipe to model `log(income)` against `height`. Then use broom and dplyr functions to extract:
1. The **coefficient estimates** and their related statistics
2. The **adj.r.squared** and **p.value** for the overall model
```{r, error = TRUE}
mod_h <- wages %>% lm( )
```
## Your Turn 3
Model `log(income)` against `education` _and_ `height`. Do the coefficients change?
```{r, error = TRUE}
mod_eh <- wages %>% lm( )
```
## Your Turn 4
Model `log(income)` against `education` and `height` and `sex`. Can you interpret the coefficients?
```{r, error = TRUE}
mod_ehs <- wages %>% lm( )
```
## Your Turn 5
Use a broom function and ggplot2 to make a line graph of `height` vs `.fitted` for our heights model, `mod_h`.
_Bonus: Overlay the plot on the original data points._
```{r}
mod_h <- wages %>% lm(log(income) ~ height, data = .)
```
## Your Turn 6
Repeat the process to make a line graph of `height` vs `.fitted` colored by `sex` for model `mod_ehs`. Are the results interpretable? Add `+ facet_wrap(~education)` to the end of your code. What happens?
```{r}
mod_ehs <- wages %>% lm(log(income) ~ education + height + sex, data = .)
```
## Your Turn 7
Use one of `spread_predictions()` or `gather_predictions()` to make a line graph of `height` vs `pred` colored by `model` for each of mod_h, mod_eh, and mod_ehs. Are the results interpretable?
Add `+ facet_grid(sex ~ education)` to the end of your code. What happens?
```{r warning = FALSE, message = FALSE}
mod_h <- wages %>% lm(log(income) ~ height, data = .)
mod_eh <- wages %>% lm(log(income) ~ education + height, data = .)
mod_ehs <- wages %>% lm(log(income) ~ education + height + sex, data = .)
```
***
# Take Aways
* Use `glance()`, `tidy()`, and `augment()` from the **broom** package to return model values in a data frame.
* Use `add_predictions()` or `gather_predictions()` or `spread_predictions()` from the **modelr** package to visualize predictions.
* Use `add_residuals()` or `gather_residuals()` or `spread_residuals()` from the **modelr** package to visualize residuals.