From 6c059016be4254d0bde52ca392593f923bd52631 Mon Sep 17 00:00:00 2001 From: carriewright11 Date: Mon, 30 Sep 2024 17:59:27 -0400 Subject: [PATCH 1/5] adding tiny bit --- modules/Reproducibility/Reproducibility.Rmd | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/modules/Reproducibility/Reproducibility.Rmd b/modules/Reproducibility/Reproducibility.Rmd index 134546b1..562cb473 100644 --- a/modules/Reproducibility/Reproducibility.Rmd +++ b/modules/Reproducibility/Reproducibility.Rmd @@ -67,6 +67,12 @@ knitr::include_graphics("images/reproducibility.png") ottrpal::include_slide("https://docs.google.com/presentation/d/1nV7x0mIIE4oWVKxpv4qJNvO17y51MajsERGtzR2qClk/edit#slide=id.gf1accd298e_0_673") ``` +## We can't get to replicability without reproducibility + +```{r, fig.alt="session info", out.width = "80%", echo = FALSE, fig.align='center'} +ottrpal::include_slide("https://docs.google.com/presentation/d/1nV7x0mIIE4oWVKxpv4qJNvO17y51MajsERGtzR2qClk/edit#slide=id.g3070a1ee60e_0_0") +``` + ## It's worth the wait ```{r, fig.alt="session info", out.width = "80%", echo = FALSE, fig.align='center'} @@ -80,6 +86,7 @@ ottrpal::include_slide("https://docs.google.com/presentation/d/1nV7x0mIIE4oWVKxp ottrpal::include_slide("https://docs.google.com/presentation/d/1nV7x0mIIE4oWVKxpv4qJNvO17y51MajsERGtzR2qClk/edit#slide=id.gf1cd772e00_0_330") ``` + ## The process ```{r, fig.alt="session info", out.width = "80%", echo = FALSE, fig.align='center'} From c07232731cad16eb0f5f0e11c5fe24a26e0c8244 Mon Sep 17 00:00:00 2001 From: carriewright11 Date: Tue, 1 Oct 2024 22:44:45 -0400 Subject: [PATCH 2/5] cutting a bit of subsetting --- .../Subsetting_Data_in_R.Rmd | 54 ++++++++++--------- 1 file changed, 29 insertions(+), 25 deletions(-) diff --git a/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd b/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd index 506141df..3e6d3a16 100644 --- a/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd +++ 
b/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd @@ -472,6 +472,7 @@ knitr::include_graphics("images/tidyselect.png") head(er_30, 2) select(er_30, ends_with("cl"), year) ``` + ## Multiple tidyselect functions Follows OR logic. @@ -481,14 +482,7 @@ select(er_30, ends_with("cl"), starts_with("r")) ``` -## Multiple patterns with tidyselect - -Need to combine the patterns with the `c()` function. - -```{r} -select(er_30, starts_with(c("r", "l"))) -``` ## The `where()` function can help select columns of a specific class{.codesmall} @@ -816,32 +810,23 @@ select(er_30, newcol, everything()) head(select(er_30, newcol, everything()), 3) ``` -## Ordering the columns of a data frame: dplyr {.codesmall} - -Put `year` at the end ("remove, everything, then add back in"): - -```{r, eval = FALSE} -select(er_30, !year, everything(), year) -``` + -```{r, echo = FALSE} -head(select(er_30, !year, everything(), year), 3) -``` + + + + -## Ordering the column names of a data frame: alphabetically {.codesmall} + + + -Using the base R `order()` function. - -```{r} -order(colnames(er_30)) -er_30 %>% select(order(colnames(er_30))) -``` ## Ordering the columns of a data frame: dplyr {.codesmall} -In addition to `select` we can also use the `relocate()` function of dplyr to rearrange the columns for more complicated moves. +In addition to `select` we can also use the `relocate()` function of dplyr to rearrange the columns for more complicated moves with the `.before` and `.after` arguments. For example, let say we just wanted `year` to be before `rate``. @@ -951,6 +936,25 @@ Image by % select(order(colnames(er_30))) +``` + ## `which()` function Instead of removing rows like filter, `which()` simply shows where the values occur if they pass a specific condition. We will see that this can be helpful later when we want to select and filter in more complicated ways. 
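A minimal sketch of the `which()` behavior described in the slide text above, using base R only. The `toy` data frame and its values are hypothetical stand-ins for `er_30` (made up for illustration, not taken from the course data); the point is only to show that `which()` reports row positions rather than dropping rows.

```r
# Hypothetical toy data standing in for er_30 (values are made up)
toy <- data.frame(
  year = c(2018, 2019, 2020, 2021),
  rate = c(4.2, 5.1, 3.8, 6.0)
)

# A logical test gives one TRUE/FALSE per row
toy$rate > 5

# which() converts that into the positions where the test is TRUE
hits <- which(toy$rate > 5)
hits

# Those positions can then be used to index the matching rows
toy[hits, ]
```

Here `hits` holds positions, not data, so the same index can be reused to look up the matching rows or the matching values of another column.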
From e35baafe5faf1d0a53a2ff80d36edebaa73fd81f Mon Sep 17 00:00:00 2001 From: carriewright11 Date: Tue, 1 Oct 2024 22:48:48 -0400 Subject: [PATCH 3/5] adding cheatsheets --- modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd | 3 +++ 1 file changed, 3 insertions(+) diff --git a/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd b/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd index 3e6d3a16..0196b49b 100644 --- a/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd +++ b/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd @@ -926,6 +926,9 @@ Even though `$` is easier for creating new columns, `mutate` is really powerful, 💻 [Lab](https://daseh.org/modules/Subsetting_Data_in_R/lab/Subsetting_Data_in_R_Lab.Rmd) +🗒 [Day 3 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-3.pdf) + +🗒 [RStudio `dplyr` Cheatsheet](https://rstudio.github.io/cheatsheets/data-transformation.pdf) ```{r, fig.alt="The End", out.width = "50%", echo = FALSE, fig.align='center'} From ac08082be656aa0857fbb881d29b31155339e0db Mon Sep 17 00:00:00 2001 From: carriewright11 Date: Tue, 1 Oct 2024 23:12:37 -0400 Subject: [PATCH 4/5] adding gut checks --- .../Subsetting_Data_in_R.Rmd | 57 ++++++++++++++++++- 1 file changed, 54 insertions(+), 3 deletions(-) diff --git a/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd b/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd index 0196b49b..c84e4482 100644 --- a/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd +++ b/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd @@ -30,7 +30,7 @@ We are constantly making improvements. - Reproducible science makes everyone's life easier! 
- `readr`has helpful functions like `read_csv()` that can help you import data into R -📃[Cheatsheet](https://daseh.org/modules/cheatsheets/Day-2.pdf) +📃[Day 2 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-2.pdf) ## Overview @@ -382,6 +382,15 @@ test clean_names(test) ``` +## GUT CHECK: Which of the following would NOT always work with a column called `counties_of_seattle_with_population_over_10,000`? + +A. Renaming it using the `rename` function to something simpler like `seattle_counties_over_10thous`. + +B. Keeping it as is and use backticks around the column name when you use it. + +C. Keeping it as is and use quotes around the column name when you use it. + + ## Summary - data frames are simpler version of a data table @@ -396,8 +405,14 @@ clean_names(test) ## Lab Part 1 -🏠 [Class Website](https://daseh.org/) -💻 [Lab](https://daseh.org/modules/Subsetting_Data_in_R/lab/Subsetting_Data_in_R_Lab.Rmd) +🏠 [Class Website](https://daseh.org/) + +💻 [Lab](https://daseh.org/modules/Subsetting_Data_in_R/lab/Subsetting_Data_in_R_Lab.Rmd) + +🗒 [Day 3 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-3.pdf) + +🗒 [RStudio `dplyr` Cheatsheet](https://rstudio.github.io/cheatsheets/data-transformation.pdf) + # Subsetting Columns @@ -495,6 +510,11 @@ select(er_30, where(is.numeric)) ``` +## GUT CHECK: What function would be useful for getting a vector version of a column? + +A. `pull()` + +B. `select()` @@ -630,6 +650,17 @@ knitr::include_graphics("https://media.giphy.com/media/5b5OU7aUekfdSAER5I/giphy. ``` https://media.giphy.com/media/5b5OU7aUekfdSAER5I/giphy.gif + +## GUT CHECK: If we want to keep just rows that meet either of two conditions, what code should we use? + +A. `filter()` with `|` + +B. `select()` with `|` + +C. `filter()` with `&` + +D. 
`select()` with `&` + ## Summary - `pull()` to get values out of a data frame/tibble @@ -651,6 +682,9 @@ https://media.giphy.com/media/5b5OU7aUekfdSAER5I/giphy.gif 🏠 [Class Website](https://daseh.org) 💻 [Lab](https://daseh.org/modules/Subsetting_Data_in_R/lab/Subsetting_Data_in_R_Lab.Rmd) +🗒 [Day 3 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-3.pdf) + +🗒 [RStudio `dplyr` Cheatsheet](https://rstudio.github.io/cheatsheets/data-transformation.pdf) ## Get the data @@ -866,6 +900,23 @@ arrange(er_30, rate, desc(year)) %>% head(n = 2) arrange(er_30, desc(year), rate) %>% head(n = 2) ``` +## GUT CHECK: What function would be useful for changing a column to be a percentage instead of a ratio? + +A. `filter()` + +B. `select()` + +C. `mutate()` + + +## GUT CHECK: How would we interpret `er_30 %>% filter(year > 2020) %>% select(year, rate)`? + +A. Get the `er_30` data, then filter it for rows with `year` values over 2020, then select only the `year` and `rate` columns. + +B. Get the `er_30` data, then filter it for rows with `year` values over 2020, then select for rows that have values for `year` and `rate`. + + + ## Summary From 50d1aa13c294ef8a398b8f81b30d663caaf7830e Mon Sep 17 00:00:00 2001 From: carriewright11 Date: Tue, 1 Oct 2024 23:20:51 -0400 Subject: [PATCH 5/5] changing from rstudio to Posit for cheatsheets... although I cut this from the rstudio lecture... can add this back --- .../Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd b/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd index c84e4482..b169dd0e 100644 --- a/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd +++ b/modules/Subsetting_Data_in_R/Subsetting_Data_in_R.Rmd @@ -30,7 +30,7 @@ We are constantly making improvements. - Reproducible science makes everyone's life easier! 
- `readr`has helpful functions like `read_csv()` that can help you import data into R -📃[Day 2 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-2.pdf) +📃 [Day 2 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-2.pdf) ## Overview @@ -409,9 +409,9 @@ C. Keeping it as is and use quotes around the column name when you use it. 💻 [Lab](https://daseh.org/modules/Subsetting_Data_in_R/lab/Subsetting_Data_in_R_Lab.Rmd) -🗒 [Day 3 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-3.pdf) +📃 [Day 3 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-3.pdf) -🗒 [RStudio `dplyr` Cheatsheet](https://rstudio.github.io/cheatsheets/data-transformation.pdf) +📃 [Posit's `dplyr` Cheatsheet](https://rstudio.github.io/cheatsheets/data-transformation.pdf) # Subsetting Columns @@ -682,9 +682,9 @@ D. `select()` with `&` 🏠 [Class Website](https://daseh.org) 💻 [Lab](https://daseh.org/modules/Subsetting_Data_in_R/lab/Subsetting_Data_in_R_Lab.Rmd) -🗒 [Day 3 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-3.pdf) +📃 [Day 3 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-3.pdf) -🗒 [RStudio `dplyr` Cheatsheet](https://rstudio.github.io/cheatsheets/data-transformation.pdf) +📃 [Posit's `dplyr` Cheatsheet](https://rstudio.github.io/cheatsheets/data-transformation.pdf) ## Get the data @@ -977,9 +977,9 @@ Even though `$` is easier for creating new columns, `mutate` is really powerful, 💻 [Lab](https://daseh.org/modules/Subsetting_Data_in_R/lab/Subsetting_Data_in_R_Lab.Rmd) -🗒 [Day 3 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-3.pdf) +📃 [Day 3 Cheatsheet](https://daseh.org/modules/cheatsheets/Day-3.pdf) -🗒 [RStudio `dplyr` Cheatsheet](https://rstudio.github.io/cheatsheets/data-transformation.pdf) +📃 [Posit's `dplyr` Cheatsheet](https://rstudio.github.io/cheatsheets/data-transformation.pdf) ```{r, fig.alt="The End", out.width = "50%", echo = FALSE, fig.align='center'}