Lily McMullen Assignment 3.Rmd

---
title: "Module 1 Assignment 3: Getting to Know your Home"
author: "Ellen Bledsoe" 'Lily McMullen'
date: "`r Sys.Date()`"      # <- uses the current date when rendered
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Assignment Description

### Purpose

The goal of this assignment is to get comfortable using the `tidyverse` with 2-dimensional data sets and compare this process to using base R.

### Task

Write R code using the `tidyverse` to successfully answer each question below.

### Criteria for Success

-   Code is within the provided code chunks
-   Code is commented with brief descriptions of what the code does
-   Code chunks run without errors
-   Code produces the correct result
-   This is the one time I *will* take points off for not using `tidyverse`...

### Due Date

Sept 15 at midnight MDT

# Assignment Questions

For this final assignment for Module 1, you'll be working with another real-world data set--a collection of data from climate stations scattered across Antarctica.

1.  In your own words, describe what the `tidyverse` is. Your answer should be between 1-3 sentences.

Tidyverse is a library in R that allows us to handle data more easily. The tidyverse library contains many packages that were designed to make data science in R easier.

1.  Load in the `tidyverse` package.

```{r load_tidyverse}
library(tidyverse)
```

3.  Load in the data file (called aggregated_station_data.csv). Save the data as an object called `weather`.

```{r load_data}
weather <- aggregated_station_data
```

4.  Take a look at the data in whichever way you would like (e.g., `glimpse()`, `slice()`, `str()`, `head()`, etc.). How many rows and columns are in the data? Type your answers below:

    rows:139,160\
    columns:12

5.  Create a data frame that includes temperatures which are above freezing (AKA greater than 0)

```{r above0}
weather %>%
  filter(temp > 0)
```

6.  Create a new data frame that includes *only* the following columns: year, day, month, temp, station_id. Save this new data frame as an object called `temp`.

```{r temp_df}
temp <- weather[, c("year", "day", "month", "temp", "station_id")]
```

7.  Using the data frame you created in Q5 above (`temp`), add a new column to that data frame that converts the temperature column (currently in Celsius) to Fahrenheit. Call the new column `tempF`. (Hint: we did this in class--use that same equation)

```{r tempF}
temp %>%
  mutate(tempF = temp * (9/5) + 32)
```

8.  In your own words (either bullet points or sentence form is fine), explain two benefits of using the pipe (`%>%`).

    Using the pipe is, in my opinion, the easiest way to forward a value into the next thing you're doing. The pipe mostly helps you connect your commands.

9.  Find the minimum temperature recorded for each month (in Celsius, the original column). (Hint: think about months first (split) and then temperature (apply). You will also want to remove all the NA values.)

```{r}
weather %>%
  group_by(month) %>%
  summarize(min_temp = min(temp, na.rm = TRUE))
```

10. Create a data frame with the mean temperature for the month of January for each station.

    Some hints:

    -   take note of how months are represented in the data

    -   think about using the pipe, how we choose which rows we want, and how we split-apply-combine

    -   remember to remove the NA values!

```{r mean_jan_temp}
weather %>%
  filter(month == 1) %>%
  group_by(station_id) %>%
  summarize(mean_temp = mean(temp, na.rm = TRUE))
  
```

## Bonus! (up to 2 points)

Write code to determine how many unique stations are in the `weather` data set. (Hint: look up the help file for the `distinct()` and the `count()` functions).

```{r unique_stations}

```