Merge branch 'main' into update-data-output-lab
avahoffman committed Oct 8, 2024
2 parents 9718a8f + 5c29925 commit 61338c5
Showing 49 changed files with 2,767 additions and 9,516 deletions.
8 changes: 4 additions & 4 deletions help.html
@@ -353,14 +353,14 @@ <h2><strong>Why are my changes not taking effect? It’s making my results look
<p>Here we are creating a new object from an existing one:</p>
<pre class="r"><code>new_rivers &lt;- sample(rivers, 5)
new_rivers</code></pre>
<pre><code>## [1] 314 529 710 450 605</code></pre>
<pre><code>## [1] 135 470 407 610 1171</code></pre>
<p>Using just this will only print the result and not actually change <code>new_rivers</code>:</p>
<pre class="r"><code>new_rivers + 1</code></pre>
<pre><code>## [1] 315 530 711 451 606</code></pre>
<pre><code>## [1] 136 471 408 611 1172</code></pre>
<p>If we want to modify <code>new_rivers</code> and save that modified version, then we need to reassign <code>new_rivers</code> like so:</p>
<pre class="r"><code>new_rivers &lt;- new_rivers + 1
new_rivers</code></pre>
<pre><code>## [1] 315 530 711 451 606</code></pre>
<pre><code>## [1] 136 471 408 611 1172</code></pre>
<p>If we forget to reassign this can cause subsequent steps to not work as expected because we will not be working with the data that has been modified.</p>
<hr />
</div>
@@ -409,7 +409,7 @@ <h2><strong>Error: object ‘X’ not found</strong></h2>
<p>Make sure you run something like this, with the <code>&lt;-</code> operator:</p>
<pre class="r"><code>rivers2 &lt;- new_rivers + 1
rivers2</code></pre>
<pre><code>## [1] 316 531 712 452 607</code></pre>
<pre><code>## [1] 137 472 409 612 1173</code></pre>
<hr />
</div>
<div id="error-unexpected-in-error-unexpected-in-error-unexpected-x-in" class="section level2">
2 changes: 1 addition & 1 deletion index.html
@@ -351,7 +351,7 @@ <h2>Testimonials from our other courses:</h2>
<h2>Find an Error!?</h2>
<hr />
<p>Feel free to submit typos/errors/etc via the GitHub repository associated with the class: <a href="https://github.com/fhdsl/DaSEH" class="uri">https://github.com/fhdsl/DaSEH</a></p>
<p>This page was last updated on 2024-10-03.</p>
<p>This page was last updated on 2024-10-08.</p>
<p style="text-align:center;">
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://live.staticflickr.com/4557/26350808799_6f9c8bcaa2_b.jpg" height="150"/> </a>
</p>
66 changes: 50 additions & 16 deletions modules/Data_Visualization/Data_Visualization.Rmd
@@ -14,10 +14,10 @@ opts_chunk$set(echo = TRUE,
fig.height = 4,
fig.width = 7,
comment = "")
library(dasehr)
library(tidyverse)
library(tidyr)
library(emo)
install.packages('emoji', repos='http://cran.us.r-project.org', dependencies=TRUE)
library(emoji)
```

## Recap
@@ -470,10 +470,26 @@ er_visits_4 %>% ggplot(aes(x = year,
- `scale_x_continuous()` and `scale_y_continuous()` can modify the scale of the axes
- by default, `ggplot()` removes points with missing values from plots.

## GUT CHECK: If we get an empty plot what might we need to do?

A. Add a `plot_` layer like `plot_point()`

B. Add a `geom_` layer like `geom_point()`


## GUT CHECK: How do we add more layers in ggplot2 plots?

A. `%>%`

B. `&`

C. `+`
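
As a minimal sketch of both gut checks, assuming the built-in `mtcars` data rather than the course's `er_visits_4` data: calling `ggplot()` with only aesthetics draws the axes but no data, and a `geom_` layer joined with `+` is what actually puts something on the plot. The object names `empty_plot` and `filled_plot` are made up for this illustration.

```r
library(ggplot2)

# Aesthetics only: axes are drawn, but nothing is plotted yet
empty_plot <- ggplot(mtcars, aes(x = wt, y = mpg))

# Layers are added with `+` (not `%>%`); each geom_ call adds one layer
filled_plot <- empty_plot +
  geom_point() +
  geom_line()

length(empty_plot$layers)   # number of layers before adding geoms
length(filled_plot$layers)  # number of layers after adding geoms
```

Comparing the two layer counts is a quick way to confirm that an "empty" plot is simply a plot with no `geom_` layer yet.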

## Lab 1

🏠 [Class Website](https://daseh.org/)\
💻 [Lab](https://daseh.org/modules//Data_Visualization/lab/Data_Visualization_Lab.Rmd)
🏠 [Class Website](https://daseh.org/)
💻 [Lab](https://daseh.org/modules//Data_Visualization/lab/Data_Visualization_Lab.Rmd)
📃[Day 6 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-6.pdf)

## theme() function:

@@ -729,7 +745,7 @@ er_bar +

## Tip - Check what you plot {.codesmall}

`r emo::ji("warning")` May not be plotting what you think you are! `r emo::ji("warning")`
`r emoji("warning")` May not be plotting what you think you are! `r emoji("warning")`

```{r, fig.width=5 , fig.height=3, fig.align='center'}
ggplot(er_visits_4, aes(x = county,
@@ -937,6 +953,19 @@ ggplotly(lots_of_lines)

Also check out the [`ggiraph` package](https://www.rdocumentation.org/packages/ggiraph/versions/0.6.1)

## `patchwork` package

Great for combining plots together

Also check out the [`patchwork` package](https://patchwork.data-imaginist.com/)

```{r, out.width= "80%", fig.align='center'}
#install.packages("patchwork")
library(patchwork)
lots_of_lines + rp_fac_plot
```
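
For a self-contained illustration of the operators, assuming `patchwork` is installed and using the built-in `mtcars` data with hypothetical plot names `p1` and `p2` (not objects from these slides): `+` places plots next to each other and `/` stacks them.

```r
library(ggplot2)
library(patchwork)

# Two small example plots (names are placeholders for this sketch)
p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) + geom_boxplot()

side_by_side <- p1 + p2     # two panels in one row
stacked <- (p1 + p2) / p2   # a row of two panels above a single panel
stacked
```

Parentheses control the grouping, so `(p1 + p2) / p2` builds the top row first and then places the third panel underneath it.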

# Saving plots

## Saving a ggplot to file
@@ -953,6 +982,20 @@ ggsave(filename = "saved_plot.png", # will save in working directory
width = 6, height = 3.5) # by default in inches
```

## GUT CHECK: How do we make sure that the boxplots are filled with color instead of just the outside border?

A. Use the `fill` argument in the `aes` specification

B. Use `color` argument in `geom_boxplot()`

## GUT CHECK: If our plot is too complicated to read, what might be a good option to fix this?

A. add more `theme()` layers

B. Use `facet_grid()` to split the plot up
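
The two gut checks above can be sketched with the built-in `iris` data (an assumption for illustration; the slides use other data, and the object names here are made up). Setting `color` in `geom_boxplot()` changes only the outline, mapping `fill` inside `aes()` fills the boxes, and `facet_grid()` splits a crowded plot into panels.

```r
library(ggplot2)

# `color` in the geom only changes the box outline
outline_only <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot(color = "blue")

# Mapping `fill` inside aes() fills each box by group
filled <- ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_boxplot()

# facet_grid() gives each group its own panel
split_up <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point() +
  facet_grid(. ~ Species)
split_up
```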



## Summary

- The `theme()` function helps you specify aspects about your plot
@@ -973,6 +1016,8 @@ Check out this [guide](https://jhudatascience.org/tidyversecourse/dataviz.html#m

🏠 [Class Website](https://daseh.org/)\
💻 [Lab](https://daseh.org/modules//Data_Visualization/lab/Data_Visualization_Lab.Rmd)
📃[Day 6 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-6.pdf)
📃[Posit's theme cheatsheet](https://github.com/claragranell/ggplot2/blob/main/ggplot_theme_system_cheatsheet.pdf)

```{r, fig.alt="The End", out.width = "50%", echo = FALSE, fig.align='center'}
knitr::include_graphics(here::here("images/the-end-g23b994289_1280.jpg"))
@@ -1150,15 +1195,4 @@ library(directlabels)
direct.label(lots_of_lines, method = list("angled.boxes"))
```

## `patchwork` package

Great for combining plots together

Also check out the [`patchwork` package](https://patchwork.data-imaginist.com/)

```{r, out.width= "50%", fig.align='center'}
#install.packages("patchwork")
library(patchwork)
(plt1 + plt2)/plt2
```
390 changes: 201 additions & 189 deletions modules/Data_Visualization/Data_Visualization.html

Large diffs are not rendered by default.

Binary file modified modules/Data_Visualization/Data_Visualization.pdf
Binary file not shown.
59 changes: 24 additions & 35 deletions modules/Data_Visualization/lab/Data_Visualization_Lab.Rmd
@@ -17,30 +17,24 @@ Load the libraries
library(readr)
library(ggplot2)
library(dplyr)
library(dasehr)
```

Open the Nitrate exposure via WA public waterways data from the `dasehr` package.

(You can also access it at the link www.daseh.org/data/Nitrate_Exposure_for_WA_Public_Water_Systems_byquarter_data.csv)

Then, use the provided code to compute a data frame `nitrate` with aggregate summary of exposure level: average exposed population (`pop_exposed_to_exceedances`) for each year (`year`).
Load the CalEnviroScreen data from the link (www.daseh.org/data/CalEnviroScreen_data.csv) and subset it so that you only have data from Fresno, Merced, Placer, Sonoma, and Yolo counties.

```{r}
nitrate_agg <- nitrate %>%
group_by(year) %>%
summarise(exposed_pop_avg = mean(pop_exposed_to_exceedances))
nitrate_agg
ces <- read_csv("https://daseh.org/data/CalEnviroScreen_data.csv")
ces_sub <- ces %>% filter(CaliforniaCounty %in% c("Fresno", "Merced", "Placer", "Sonoma", "Yolo"))
```
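
When subsetting a column to several values, `%in%` and `==` behave differently, which matters for a filter like the one above. The following base R sketch uses a made-up vector (the values and object names are illustrative, not taken from the CalEnviroScreen data) to show that `==` recycles the right-hand side and compares position by position, while `%in%` tests membership for each element.

```r
# Toy vector standing in for a county column (values are illustrative)
county <- c("Fresno", "Yolo", "Merced", "Fresno", "Kern", "Yolo")
keep <- c("Fresno", "Merced", "Yolo")

# `==` recycles `keep` and compares element 1 to "Fresno", 2 to "Merced", ...
matched_by_equals <- county == keep

# `%in%` asks, for each element, whether it appears anywhere in `keep`
matched_by_in <- county %in% keep

sum(matched_by_equals)  # rows kept with `==`
sum(matched_by_in)      # rows kept with `%in%`
```

With `==`, rows are silently dropped whenever a value does not happen to line up with the recycled vector, so `%in%` is the safer choice for this kind of subset.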

### 1.1

Use `ggplot2` package make plot of average exposed population (`exposed_pop_avg`; y-axis) for each year (`year`; x-axis). You can use lines layer (`+ geom_line()`) or points layer (`+ geom_point()`), or both!
Use the `ggplot2` package to make a plot of how diesel particulate concentration (`DieselPM`; y-axis) is associated with traffic density values (`Traffic`; x-axis). You can use a lines layer (`+ geom_line()`), a points layer (`+ geom_point()`), or both!

Assign the plot to variable `my_plot`. Type `my_plot` in the console to have it displayed.

`DieselPM`: Diesel PM emissions from on-road and non-road sources
`Traffic`: Traffic density in vehicle-kilometers per hour per road length, within 150 meters of the census tract boundary

```
# General format
ggplot(???, aes(x = ???, y = ???)) +
@@ -62,7 +56,8 @@ ggplot(???, aes(x = ???, y = ???)) +

### 1.3

Use the `scale_x_continuous()` function to plot the x axis with the following breaks `c(1999, 2001, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2017, 2019)`.
Use the `scale_x_continuous()` function to plot the x axis with the following breaks `c(250, 750, 1250, 1750, 2250)`.
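
As a rough sketch of how custom breaks are passed, assuming simulated data in place of the lab's `ces_sub` (the data frame `toy` and its values are invented for illustration; only the column names mirror the lab):

```r
library(ggplot2)

# Simulated stand-in for the lab data; values are random, not real
set.seed(1)
toy <- data.frame(Traffic = runif(50, min = 0, max = 2500),
                  DieselPM = runif(50, min = 0, max = 1))

traffic_breaks <- c(250, 750, 1250, 1750, 2250)

# `breaks` sets where tick marks and labels appear on the x axis
toy_plot <- ggplot(toy, aes(x = Traffic, y = DieselPM)) +
  geom_point() +
  scale_x_continuous(breaks = traffic_breaks)
toy_plot
```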


```
# General format
@@ -92,7 +87,10 @@ my_plot + theme_bw()

### P.1

Create a boxplot (with the `geom_boxplot()` function) using the `nitrate` data, where `quarter` is plotted on the x axis and `pop_on_sampled_PWS` is plotted on the y axis.
Create a boxplot (with the `geom_boxplot()` function) using the `ces_sub` data, where `CaliforniaCounty` is plotted on the x axis and `DrinkingWater` is plotted on the y axis.

`DrinkingWater`: Drinking water contaminant index for selected contaminants. A higher value means drinking water contains a greater volume of contaminants.


```{r P1response}
@@ -102,21 +100,10 @@ Create a boxplot (with the `geom_boxplot()` function) using the `nitrate` data,
# Part 2

### 2.1
Let's look at the plot of traffic density and diesel particulate matter again.

Use the provided code to compute a data frame `nitrate_agg_2` with aggregate summary of WA Nitrate data: population exposed to less than 10 ug/L of nitrate in the water (sum of `pop_0-3ug/L`, `pop_>3-5ug/L`, and `pop_>5-10ug/L`) -- separately for each year (`year`) and for each quarter (`quarter`.

```{r}
nitrate_agg_2 <- nitrate %>%
group_by(year, quarter) %>%
summarise(pop_less_than_10ug_perL = sum(`pop_0-3ug/L`, `pop_>3-5ug/L`, `pop_>5-10ug/L`))
nitrate_agg_2
```

### 2.2
Use the `ggplot2` package to make a plot of how diesel particulate concentration (`DieselPM`; y-axis) is associated with traffic density values (`Traffic`; x-axis), where each county (`CaliforniaCounty`) has a different color (hint: use `color = CaliforniaCounty` in mapping).

Use `ggplot2` package to make a plot showing trajectories of total population exposed to less than 10 ug/L of nitrate (`pop_less_than_10ug_perL`; y-axis) over year (`year`; x-axis), where each quarter type has a different color (hint: use `color = type` in mapping).

```
# General format
@@ -129,25 +116,26 @@ ggplot(???, aes(
geom_point()
```

```{r 2.2response}
```{r 2.1response}
```

### 2.3
### 2.2

Redo the above plot by adding faceting (`+ facet_wrap( ~ CaliforniaCounty, ncol = 3)`) so that the data for each county appears in a separate plot panel.

Redo the above plot by adding a faceting (`+ facet_wrap( ~ quarter, ncol = 2)`) to have data for quarter in a separate plot panel.

Assign the new plot as an object called `facet_plot`.

```{r 2.3response}
```{r 2.2response}
```

### 2.4
### 2.3

Observe what happens when you remove either `geom_line()` OR `geom_point()` from one of your plots above.

```{r 2.4response}
```{r 2.3response}
```

@@ -156,7 +144,8 @@ Observe what happens when you remove either `geom_line()` OR `geom_point()` from

### P.2

Modify `facet_plot` to remove the legend (hint use `theme()` and the `legend.position` argument) and change the names of the axis titles to be "Population exposed to less than 10 ug/L of nitrate in water" for the y axis and "Year" for the x axis.
Modify `facet_plot` to remove the legend (hint: use `theme()` and the `legend.position` argument) and change the axis titles to "Diesel particulate matter" for the y axis and "Traffic density" for the x axis.


```{r P.2response}