Render site
github-actions[bot] committed Oct 2, 2024
1 parent ed45e6c commit 6a36ef5
Showing 6 changed files with 630 additions and 3,969 deletions.
8 changes: 4 additions & 4 deletions help.html
@@ -353,14 +353,14 @@ <h2><strong>Why are my changes not taking effect? It’s making my results look
<p>Here we are creating a new object from an existing one:</p>
<pre class="r"><code>new_rivers &lt;- sample(rivers, 5)
new_rivers</code></pre>
<pre><code>## [1] 625 411 250 490 270</code></pre>
<pre><code>## [1] 600 671 260 210 760</code></pre>
<p>Using just this will only print the result and not actually change <code>new_rivers</code>:</p>
<pre class="r"><code>new_rivers + 1</code></pre>
<pre><code>## [1] 626 412 251 491 271</code></pre>
<pre><code>## [1] 601 672 261 211 761</code></pre>
<p>If we want to modify <code>new_rivers</code> and save that modified version, then we need to reassign <code>new_rivers</code> like so:</p>
<pre class="r"><code>new_rivers &lt;- new_rivers + 1
new_rivers</code></pre>
<pre><code>## [1] 626 412 251 491 271</code></pre>
<pre><code>## [1] 601 672 261 211 761</code></pre>
<p>If we forget to reassign this can cause subsequent steps to not work as expected because we will not be working with the data that has been modified.</p>
<hr />
</div>
@@ -409,7 +409,7 @@ <h2><strong>Error: object ‘X’ not found</strong></h2>
<p>Make sure you run something like this, with the <code>&lt;-</code> operator:</p>
<pre class="r"><code>rivers2 &lt;- new_rivers + 1
rivers2</code></pre>
<pre><code>## [1] 627 413 252 492 272</code></pre>
<pre><code>## [1] 602 673 262 212 762</code></pre>
<hr />
</div>
<div id="error-unexpected-in-error-unexpected-in-error-unexpected-x-in" class="section level2">
2 changes: 1 addition & 1 deletion index.html
@@ -351,7 +351,7 @@ <h2>Testimonials from our other courses:</h2>
<h2>Find an Error!?</h2>
<hr />
<p>Feel free to submit typos/errors/etc via the GitHub repository associated with the class: <a href="https://github.com/fhdsl/DaSEH" class="uri">https://github.com/fhdsl/DaSEH</a></p>
<p>This page was last updated on 2024-10-01.</p>
<p>This page was last updated on 2024-10-02.</p>
<p style="text-align:center;">
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://live.staticflickr.com/4557/26350808799_6f9c8bcaa2_b.jpg" height="150"/> </a>
</p>
3,987 changes: 440 additions & 3,547 deletions modules/Data_Summarization/Data_Summarization.html

Large diffs are not rendered by default.

48 changes: 20 additions & 28 deletions modules/Data_Summarization/lab/Data_Summarization_Lab.Rmd
@@ -11,45 +11,38 @@ knitr::opts_chunk$set(echo = TRUE)

# Part 1

Data used

CalEnviroScreen Dataset: CalEnviroScreen is a project that ranks census tracts in California based on potential exposures to pollutants, adverse environmental conditions, socioeconomic factors and the prevalence of certain health conditions. Data used in the CalEnviroScreen model come from national and state sources.

The data is from https://calenviroscreen-oehha.hub.arcgis.com/#Data

You can Download as a CSV in your current working directory. Note its also available at: https://daseh.org/data/CalEnviroScreen_data.csv
We'll again use the CalEnviroScreen dataset for the lab. Load the `tidyverse` package and the dataset, which can be found at https://daseh.org/data/CalEnviroScreen_data.csv. Name the dataset `ces`.


```{r, echo = TRUE, message=FALSE, error = FALSE}
library(tidyverse)
library(dasehr)
```
```{r}
ces <- calenviroscreen
# Or use
# ces <- read_csv(file = "https://daseh.org/data/CalEnviroScreen_data.csv")
ces <- read_csv(file = "https://daseh.org/data/CalEnviroScreen_data.csv")
```

### 1.1

How observations/rows are in the `ces` data set? You can use `dim()` or `nrow()` or examine the Environment.
How many observations/rows are in the `ces` data set? You can use `dim()` or `nrow()` or examine the Environment.

```{r 1.1response}
```

### 1.2

What was the population of California in the 2010 census, based on the `TotalPop` column? (use `sum()`)
The `TotalPop` column includes information about the population for each census tract as of the 2010 census.

NOTE: A census tract is a small, relatively permanent area within a county used to present data from the census. Each row in the `ces` dataset corresponds to a single census tract. See https://www2.census.gov/geo/pdfs/education/CensusTracts.pdf

What was the total population in the dataset based on the 2010 census? (use `sum()` and the `TotalPop` column)

```{r 1.2response}
```

### 1.3

What is the largest (`max`) total population (`TotalPop`) among all census tracts (rows)? Use `summarize`.
What was the largest population, according to the 2010 census, for a single census tract (row)? Use `summarize` and `max`.

```
# General format
@@ -63,7 +56,7 @@ DATA_TIBBLE %>%

### 1.4

Modify your code from 1.3 to add the `min` of `TotalPop` using the `summarize` function.
Modify your code from 1.3 to add the smallest population among census tracts. Use `min` in your `summarize` function.

```
# General format
@@ -82,7 +75,9 @@ DATA_TIBBLE %>%

### P.1

Summarize the `ces` data to get the mean of `TotalPop` and `Pesticides`. Make sure to remove `NA`s.
Summarize the `ces` data to get the mean of both the `TotalPop` and `Pesticides` columns. Make sure to remove `NA`s.

`Pesticides`: Total pounds of selected active pesticide ingredients used in production-agriculture per square mile. The higher the number, the greater the amount of pesticides used on agricultural sites.
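
As a quick illustration of why removing `NA`s matters, here is a minimal sketch using a made-up vector (`toy_values` is hypothetical and not part of `ces`):

```r
# Hypothetical toy vector with one missing value (not from the ces data)
toy_values <- c(2, NA, 4)

# Without na.rm, a single NA makes the whole mean NA
mean_with_na <- mean(toy_values)

# With na.rm = TRUE, the NA is dropped before averaging
mean_without_na <- mean(toy_values, na.rm = TRUE)

mean_with_na
mean_without_na
```

The same `na.rm = TRUE` argument can be passed to `mean()` when it is called inside `summarize()`.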

```
# General format
@@ -106,9 +101,7 @@ Given that parts of California are heavily agricultural, and the max value for t

### P.3

Filter any zeros out of `ces` `Pesticides`. Use `filter()`. Assign this "cleaned" dataset object the name `exurban_ces``.

(We are making the admittedly shaky assumption that places with no reported pesticide use are within cities.)
Filter any zeros from the `Pesticides` column out of `ces`. Use `filter()`. Assign this "cleaned" dataset object the name `ces_pest`.

```
# General format
@@ -130,23 +123,21 @@ How many census tracts have pesticide values greater than 0?

### 2.1

The variable `CES4.0PercRange` categorizes the calculated CES4.0 value (a measure of the pollution burden in a particular region) into percentile ranges, grouped by 5% increments.

How many census tracts are there in each percentile range? Use `count()` on the column named `CES4.0PercRange`. Use `ces` as your input data.
How many census tracts are present in each California county? Use `count()` on the column named `CaliforniaCounty`. Use `ces` as your input data.

```{r 2.1response}
```

### 2.2

Modify your code from question 2.1 to break down each percentile range by California county. Use `count()` on the columns named `CES4.0PercRange` and `CaliforniaCounty`.
Let's break down the count further. Modify your code from question 2.1 to count census tracts by County AND ZIP code. Use `count()` on the columns named `CaliforniaCounty` and `ZIP`.

```{r 2.2response}
```

Hmm. This isn't the easiest table to read. Let's try a different approach.
This isn't the only way we can create this table in R. Let's look at another way to build it.
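
To make the comparison concrete, here is a minimal sketch on a made-up tibble (`toy` and its columns are hypothetical, not from `ces`). It suggests that `count()` and `group_by()` followed by `summarize()` with `n()` give the same group counts:

```r
library(dplyr)

# Hypothetical toy data: three rows in two groups (not from the ces data)
toy <- tibble(group = c("a", "a", "b"), value = c(10, 20, 30))

# Approach 1: count() tallies the rows in each group
by_count <- toy %>% count(group)

# Approach 2: group_by() plus summarize() with n() does the same tally
by_summarize <- toy %>%
  group_by(group) %>%
  summarize(n = n())

by_count
by_summarize
```

The `group_by()` route takes more typing, but it lets other summaries (such as a mean) be computed in the same `summarize()` call.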

### 2.3

@@ -165,7 +156,7 @@ DATA_TIBBLE %>%

### 2.4

Modify your code from 2.3 to also group by `CES4.0PercRange`.
Modify your code from 2.3 to also group by `ZIP`.

```{r 2.4response}
@@ -176,7 +167,7 @@ Modify your code from 2.3 to also group by `CES4.0PercRange`.

### P.4

Modify code from 2.3 to also summarize by total population per group. In your summarized output, make sure you call the new summarized average total population variable (column name) "mean". In other words, the head of your output should look like:
Modify code from 2.3 (the code that only groups by county) to also summarize by total population (`TotalPop`) per group. In your summarized output, make sure you call the new summarized average total population variable (column name) "mean". In other words, the head of your output should look like:

```
# A tibble: 58 × 3
@@ -185,6 +176,7 @@ Modify code from 2.3 to also summarize by total population per group. In your su
1 "Alameda " 360 4602.
...
```
(In the above table, remember that the "count" column is counting the number of census tracts.)

```{r P.4response}
554 changes: 165 additions & 389 deletions modules/Data_Summarization/lab/Data_Summarization_Lab_Key.html

Large diffs are not rendered by default.

Binary file modified modules/cheatsheets/Day-4.pdf
Binary file not shown.
