Merge branch 'main' into update-data-output-lab

fhdsl · Oct 8, 2024 · 61338c5
2 parents 9718a8f + 5c29925
Showing 49 changed files with 2,767 additions and 9,516 deletions.
diff --git a/help.html b/help.html
@@ -353,14 +353,14 @@ <h2><strong>Why are my changes not taking effect? It’s making my results look
 <p>Here we are creating a new object from an existing one:</p>
 <pre class="r"><code>new_rivers &lt;- sample(rivers, 5)
 new_rivers</code></pre>
-<pre><code>## [1] 314 529 710 450 605</code></pre>
+<pre><code>## [1]  135  470  407  610 1171</code></pre>
 <p>Using just this will only print the result and not actually change <code>new_rivers</code>:</p>
 <pre class="r"><code>new_rivers + 1</code></pre>
-<pre><code>## [1] 315 530 711 451 606</code></pre>
+<pre><code>## [1]  136  471  408  611 1172</code></pre>
 <p>If we want to modify <code>new_rivers</code> and save that modified version, then we need to reassign <code>new_rivers</code> like so:</p>
 <pre class="r"><code>new_rivers &lt;- new_rivers + 1
 new_rivers</code></pre>
-<pre><code>## [1] 315 530 711 451 606</code></pre>
+<pre><code>## [1]  136  471  408  611 1172</code></pre>
 <p>If we forget to reassign this can cause subsequent steps to not work as expected because we will not be working with the data that has been modified.</p>
 <hr />
 </div>
@@ -409,7 +409,7 @@ <h2><strong>Error: object ‘X’ not found</strong></h2>
 <p>Make sure you run something like this, with the <code>&lt;-</code> operator:</p>
 <pre class="r"><code>rivers2 &lt;- new_rivers + 1
 rivers2</code></pre>
-<pre><code>## [1] 316 531 712 452 607</code></pre>
+<pre><code>## [1]  137  472  409  612 1173</code></pre>
 <hr />
 </div>
 <div id="error-unexpected-in-error-unexpected-in-error-unexpected-x-in" class="section level2">

diff --git a/index.html b/index.html
@@ -351,7 +351,7 @@ <h2>Testimonials from our other courses:</h2>
 <h2>Find an Error!?</h2>
 <hr />
 <p>Feel free to submit typos/errors/etc via the GitHub repository associated with the class: <a href="https://github.com/fhdsl/DaSEH" class="uri">https://github.com/fhdsl/DaSEH</a></p>
-<p>This page was last updated on 2024-10-03.</p>
+<p>This page was last updated on 2024-10-08.</p>
 <p style="text-align:center;">
 <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://live.staticflickr.com/4557/26350808799_6f9c8bcaa2_b.jpg" height="150"/> </a>
 </p>

diff --git a/modules/Data_Visualization/Data_Visualization.Rmd b/modules/Data_Visualization/Data_Visualization.Rmd
@@ -14,10 +14,10 @@ opts_chunk$set(echo = TRUE,
                fig.height = 4,
                fig.width = 7,
                comment = "")
-library(dasehr)
 library(tidyverse)
 library(tidyr)
-library(emo)
+install.packages('emoji', repos='http://cran.us.r-project.org', dependencies=TRUE)
+library(emoji)
 ```
 
 ## Recap
@@ -470,10 +470,26 @@ er_visits_4 %>% ggplot(aes(x = year,
 -   `scale_x_continuous()` and `scale_y_continuous()` can modify the scale of the axes
 -   by default, `ggplot()` removes points with missing values from plots.
 
+## GUT CHECK: If we get an empty plot what might we need to do?
+
+A. Add a `plot_` layer like `plot_point()`
+
+B. Add a `geom_` layer like `geom_point()`
+
+
+## GUT CHECK: How do we add more layers in ggplot2 plots?
+
+A. `%>%`
+
+B. `&`
+
+C. `+`
+
 ## Lab 1
 
-🏠 [Class Website](https://daseh.org/)\
-💻 [Lab](https://daseh.org/modules//Data_Visualization/lab/Data_Visualization_Lab.Rmd)
+🏠 [Class Website](https://daseh.org/)
+💻 [Lab](https://daseh.org/modules//Data_Visualization/lab/Data_Visualization_Lab.Rmd) 
+📃 [Day 6 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-6.pdf)
 
 ## theme() function:
 
@@ -729,7 +745,7 @@ er_bar +
 
 ## Tip - Check what you plot {.codesmall}
 
-`r emo::ji("warning")` May not be plotting what you think you are! `r emo::ji("warning")`
+`r emoji("warning")` May not be plotting what you think you are! `r emoji("warning")`
 
 ```{r, fig.width=5 , fig.height=3, fig.align='center'}
 ggplot(er_visits_4, aes(x = county,
@@ -937,6 +953,19 @@ ggplotly(lots_of_lines)
 
 Also check out the [`ggiraph` package](https://www.rdocumentation.org/packages/ggiraph/versions/0.6.1)
 
+## `patchwork` package
+
+Great for combining plots together
+
+Also check out the [`patchwork` package](https://patchwork.data-imaginist.com/)
+
+```{r, out.width= "80%", fig.align='center'}
+#install.packages("patchwork")
+library(patchwork)
+lots_of_lines + rp_fac_plot
+
+```
+
 # Saving plots
 
 ## Saving a ggplot to file
@@ -953,6 +982,20 @@ ggsave(filename = "saved_plot.png",  # will save in working directory
        width = 6, height = 3.5)               # by default in inches
 ```
 
+## GUT CHECK: How do we make sure that the boxplots are filled with color instead of just the outside border?
+
+A. Use the `fill` argument in the `aes` specification 
+
+B. Use `color` argument in `geom_boxplot()`
+
+## GUT CHECK: If our plot is too complicated to read, what might be a good option to fix this?
+
+A. Add more `theme()` layers
+
+B. Use `facet_grid()` to split the plot up
+
+
+
 ## Summary
 
 -   The `theme()` function helps you specify aspects about your plot
@@ -973,6 +1016,8 @@ Check out this [guide](https://jhudatascience.org/tidyversecourse/dataviz.html#m
 
 🏠 [Class Website](https://daseh.org/)\
 💻 [Lab](https://daseh.org/modules//Data_Visualization/lab/Data_Visualization_Lab.Rmd)
+📃 [Day 6 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-6.pdf)
+📃 [Posit's theme cheatsheet](https://github.com/claragranell/ggplot2/blob/main/ggplot_theme_system_cheatsheet.pdf)
 
 ```{r, fig.alt="The End", out.width = "50%", echo = FALSE, fig.align='center'}
 knitr::include_graphics(here::here("images/the-end-g23b994289_1280.jpg"))
@@ -1150,15 +1195,4 @@ library(directlabels)
 direct.label(lots_of_lines, method = list("angled.boxes"))
 ```
 
-## `patchwork` package
-
-Great for combining plots together
-
-Also check out the [`patchwork` package](https://patchwork.data-imaginist.com/)
 
-```{r, out.width= "50%", fig.align='center'}
-#install.packages("patchwork")
-library(patchwork)
-(plt1 + plt2)/plt2
-
-```
diff --git a/modules/Data_Visualization/Data_Visualization.html b/modules/Data_Visualization/Data_Visualization.html
diff --git a/modules/Data_Visualization/Data_Visualization.pdf b/modules/Data_Visualization/Data_Visualization.pdf
diff --git a/modules/Data_Visualization/lab/Data_Visualization_Lab.Rmd b/modules/Data_Visualization/lab/Data_Visualization_Lab.Rmd
@@ -17,30 +17,24 @@ Load the libraries
 library(readr)
 library(ggplot2)
 library(dplyr)
-library(dasehr)
 ```
 
-Open the Nitrate exposure via WA public waterways data from the `dasehr` package.
-
-(You can also access it at the link www.daseh.org/data/Nitrate_Exposure_for_WA_Public_Water_Systems_byquarter_data.csv)
-
-Then, use the provided code to compute a data frame `nitrate` with aggregate summary of exposure level: average exposed population  (`pop_exposed_to_exceedances`) for each year (`year`).
+Load the CalEnviroScreen data from the link www.daseh.org/data/CalEnviroScreen_data.csv and subset it so that you only have data from Fresno, Merced, Placer, Sonoma, and Yolo counties.
 
 ```{r}
-
-nitrate_agg <- nitrate %>%
-  group_by(year) %>%
-  summarise(exposed_pop_avg = mean(pop_exposed_to_exceedances))
-
-nitrate_agg
+ces <- read_csv("https://daseh.org/data/CalEnviroScreen_data.csv")
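+# %in% tests set membership; == would recycle the county vector and compare it position by position
+ces_sub <- ces %>% filter(CaliforniaCounty %in% c("Fresno", "Merced", "Placer", "Sonoma", "Yolo"))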
 ```
 
 ### 1.1
 
-Use `ggplot2` package make plot of average exposed population (`exposed_pop_avg`; y-axis) for each year (`year`; x-axis). You can use lines layer (`+ geom_line()`) or points layer (`+ geom_point()`), or both!
+Use the `ggplot2` package to make a plot of how diesel particulate concentration (`DieselPM`; y-axis) is associated with traffic density values (`Traffic`; x-axis). You can use a lines layer (`+ geom_line()`), a points layer (`+ geom_point()`), or both!
 
 Assign the plot to variable `my_plot`. Type `my_plot` in the console to have it displayed.
 
+`DieselPM`: Diesel PM emissions from on-road and non-road sources
+`Traffic`: Traffic density in vehicle-kilometers per hour per road length, within 150 meters of the census tract boundary
+
 ```
 # General format
 ggplot(???, aes(x = ???, y = ???)) +
@@ -62,7 +56,8 @@ ggplot(???, aes(x = ???, y = ???)) +
 
 ### 1.3
 
-Use the `scale_x_continuous()` function to plot the x axis with the following breaks `c(1999, 2001, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2017, 2019)`.
+Use the `scale_x_continuous()` function to plot the x axis with the following breaks `c(250, 750, 1250, 1750, 2250)`. 
+
 
 ```
 # General format
@@ -92,7 +87,10 @@ my_plot + theme_bw()
 
 ### P.1
 
-Create a boxplot (with the `geom_boxplot()` function) using the `nitrate` data, where `quarter` is plotted on the x axis and `pop_on_sampled_PWS` is plotted on the y axis.
+Create a boxplot (with the `geom_boxplot()` function) using the `ces_sub` data, where `CaliforniaCounty` is plotted on the x axis and `DrinkingWater` is plotted on the y axis. 
+
+`DrinkingWater`: Drinking water contaminant index for selected contaminants. A higher value means drinking water contains a greater volume of contaminants.
+
 
 ```{r P1response}
 
@@ -102,21 +100,10 @@ Create a boxplot (with the `geom_boxplot()` function) using the `nitrate` data,
 # Part 2
 
 ### 2.1
+Let's look at the plot of traffic density and diesel particulate matter again.
 
-Use the provided code to compute a data frame `nitrate_agg_2` with aggregate summary of WA Nitrate data: population exposed to less than 10 ug/L of nitrate in the water (sum of `pop_0-3ug/L`, `pop_>3-5ug/L`, and `pop_>5-10ug/L`) -- separately for each year (`year`) and for each quarter (`quarter`.
-
-```{r}
-
-nitrate_agg_2 <- nitrate %>%
-  group_by(year, quarter) %>%
-  summarise(pop_less_than_10ug_perL = sum(`pop_0-3ug/L`, `pop_>3-5ug/L`, `pop_>5-10ug/L`))
-
-nitrate_agg_2
-```
-
-### 2.2
+Use the `ggplot2` package to make a plot of how diesel particulate concentration (`DieselPM`; y-axis) is associated with traffic density values (`Traffic`; x-axis), where each county (`CaliforniaCounty`) has a different color (hint: use `color = CaliforniaCounty` in mapping).
 
-Use `ggplot2` package to make a plot showing trajectories of total population exposed to less than 10 ug/L of nitrate (`pop_less_than_10ug_perL`; y-axis) over year (`year`; x-axis), where each quarter type has a different color (hint: use `color = type` in mapping).
 
 ```
 # General format
@@ -129,25 +116,26 @@ ggplot(???, aes(
   geom_point()
 ```
 
-```{r 2.2response}
+```{r 2.1response}
 
 ```
 
-### 2.3
+### 2.2
+
+Redo the above plot by adding faceting (`+ facet_wrap(~ CaliforniaCounty, ncol = 3)`) so that the data for each county appear in a separate plot panel.
 
-Redo the above plot by adding a faceting  (`+ facet_wrap( ~ quarter, ncol = 2)`) to have data for quarter in a separate plot panel.
 
 Assign the new plot as an object called `facet_plot`.
 
-```{r 2.3response}
+```{r 2.2response}
 
 ```
 
-### 2.4
+### 2.3
 
 Observe what happens when you remove either `geom_line()` OR `geom_point()` from one of your plots above.
 
-```{r 2.4response}
+```{r 2.3response}
 
 ```
 
@@ -156,7 +144,8 @@ Observe what happens when you remove either `geom_line()` OR `geom_point()` from
 
 ### P.2
 
-Modify `facet_plot` to remove the legend (hint use `theme()` and the `legend.position` argument) and change the names of the axis titles to be "Population exposed to less than 10 ug/L of nitrate in water" for the y axis and "Year" for the x axis.
+Modify `facet_plot` to remove the legend (hint: use `theme()` and the `legend.position` argument) and change the axis titles to "Diesel particulate matter" for the y axis and "Traffic density" for the x axis.
+
 
 ```{r P.2response}