Merge branch 'master' of https://github.com/nbisweden/workshop-r
lokeshbio committed Oct 17, 2024
2 parents 220bd92 + 3dcfb98 commit 157f817
Showing 20 changed files with 1,318 additions and 284 deletions.
32 changes: 25 additions & 7 deletions README.md
@@ -37,9 +37,33 @@ If you are using command line, you can install `vd` and open and edit the file `

### Docker

A docker container is used in GitHub actions to build the website. The Dockerfile contains the image definition. To update the docker image, follow the steps below:
R packages needed to build the website and run the labs are all contained in a Docker container. To run the Docker container locally, follow the instructions below:

:exclamation: Image is about 4.8 GB!

```
# pull the container
docker pull --platform=linux/amd64 ghcr.io/nbisweden/workshop-r:latest
# render whole website
docker run --platform=linux/amd64 --rm -u $(id -u ${USER}):$(id -g ${USER}) -v ${PWD}:/rmd ghcr.io/nbisweden/workshop-r:latest Rscript -e 'rmarkdown::render_site()'
# render one file
docker run --platform=linux/amd64 --rm -u $(id -u ${USER}):$(id -g ${USER}) -v ${PWD}:/rmd ghcr.io/nbisweden/workshop-r:latest Rscript -e 'rmarkdown::render("index.Rmd")'
```

To run RStudio Server and develop in the browser, run:

```
docker run --platform=linux/amd64 --rm -e PASSWORD=rstudio -p 8788:8787 -v ${PWD}:/rmd ghcr.io/nbisweden/workshop-r:latest
```

Go to [http://localhost:8788/](http://localhost:8788/) or [http://0.0.0.0:8788](http://0.0.0.0:8788). The username is `rstudio` and the password is `rstudio`. Change to the folder `/rmd` to see your files.

To add new packages, update the `Dockerfile`, rebuild the container, test it, and push it to the repository. Make changes to the `Dockerfile` as needed. Then, to rebuild and push the Docker image, follow the steps below:

:exclamation: Remember to update the version number
:exclamation: Remember to render the whole website to make sure everything works

```
# build container and add tags
@@ -50,12 +74,6 @@ docker tag ghcr.io/nbisweden/workshop-r:1.1.0 ghcr.io/nbisweden/workshop-r:lates
docker login ghcr.io
docker push ghcr.io/nbisweden/workshop-r:1.1.0
docker push ghcr.io/nbisweden/workshop-r:latest
# run container locally
# render whole website
docker run --platform=linux/amd64 --rm -u $(id -u ${USER}):$(id -g ${USER}) -v ${PWD}:/rmd ghcr.io/nbisweden/workshop-r:latest Rscript -e 'rmarkdown::render_site()'
# render one file
docker run --platform=linux/amd64 --rm -u $(id -u ${USER}):$(id -g ${USER}) -v ${PWD}:/rmd ghcr.io/nbisweden/workshop-r:latest Rscript -e 'rmarkdown::render("index.Rmd")'
```

---
2 changes: 2 additions & 0 deletions _site.yml
@@ -37,4 +37,6 @@ navbar:
href: home_precourse.html
- text: Info
href: home_info.html
- text: Projects
href: home_projects.html

Binary file added data/slide_intro/num_pkgs.jpg
Binary file added data/slide_programming/Data_classification.png
Binary file added data/slide_r_environment/ggplot2_CRAN.png
14 changes: 8 additions & 6 deletions home_content.Rmd
@@ -30,23 +30,25 @@ This page contains links to different lectures (slides) and practical exercises
* [Intro to R (Slides)](slide_r_intro.html)
* [Intro to R environment (Slides)](slide_r_environment.html)
* [Intro to programming in R (Slides)](slide_r_programming_1.html)
* [Variables and Operators (Slides)](slide_elements_1.pdf)
* [Variables and Operators (Slides)](slide_r_elements_1.html)
* [Data types (Lab)](lab_datatypes.html)
* [Vectors and Strings (Slides)](slide_elements_2.pdf)
* [Matrices, Lists and Dataframes (Slides)](slide_elements_3.pdf)
* [Vectors and Strings (Slides)](slide_r_elements_2.html)
* [Matrices, Lists and Dataframes (Slides)](slide_r_elements_3.html)
* [Working with Vectors (Lab)](lab_vectors.html)
* [Dataframes (Lab)](lab_dataframes.html)
* [Loops and functions (Slides)](slide_r_elements_4.html)
* [Loops and functions (Lab)](lab_loops.html)

**Data wrangling**

* [Loading data (Slides)](slide_loadingdata.pdf)
* [Loading data (Slides)](slide_loading_data.html)
* [Loading data (Lab)](lab_loadingdata.html)
* [Tidyverse (Slides)](slide_tidyverse.html)
* [Tidyverse (Lab)](lab_tidyverse.html)

**Graphics**

* [Graphics with base R (Slides)](slide_graphics.pdf)
* [Graphics with base R (Slides)](slide_base_graphics.html)
* [Graphics with base R (Lab)](lab_graphics.html)
* [Graphics with ggplot2 (Slides)](slide_ggplot2.html)
* [Working with ggplot2 (Lab)](lab_ggplot2.html)
@@ -58,7 +60,7 @@ This page contains links to different lectures (slides) and practical exercises
**Useful resources**

* [Data structures in R](data/common/R_data_structures_ver_1_1.pdf)
* [Color names in R](data/common/Rolor.pdf)
* [Color names in R](data/common/Rcolor.pdf)
* [Visualising data](data/common/rules_for_using_color.pdf)
* [Naming conventions in R](data/common/Rnaming.pdf)
* [Introduction to statistical tests in R](data/common/stats_tests.pdf)
25 changes: 15 additions & 10 deletions home_precourse.Rmd
@@ -65,21 +65,26 @@ RStudio provides you with tools like code editor with highlighting, project mana

Extra R packages used in the workshop exercises (if any) are listed below. It is recommended that you install these in advance. Simply copy and paste the code into R.

```{r,eval=TRUE,chunk.title=NULL,echo=FALSE,comment="",class.output="r"}
# this code block reads package names from '_site.yml' and prints them as installation instruction.
```{r include=FALSE}
# this first chunk scans the root directory, finds the packages used across the files and stores them for the installation instructions printed by the next chunk.
pkg <- yaml::read_yaml("_site.yml")
#Add to the pkg_discard object the packages you want to discard from the list
pkg<-unique(renv::dependencies()$Package)
pkg_discard<-c("mkteachr", "manipulateWidget")
pkg_list<-pkg[!pkg %in% pkg_discard]
```

```{r echo=FALSE, warning=FALSE, chunk.title=NULL, class.output="r", comment="", eval=TRUE}
if(!is.null(pkg$packages$packages_cran_student)) {
cat("# install from cran\n")
cat(paste0("install.packages(c('",paste(pkg$packages$packages_cran_student,sep="",collapse="','"),"'))"))
cat(paste0("install.packages(c('",paste(pkg_list,sep="",collapse="','"),"'))"))
cat("\n")
}
if(!is.null(pkg$packages$packages_bioc_student)) {
cat("# install from bioconductor\n")
cat(paste0("BiocManager::install(c('",paste(pkg$packages$packages_bioc_student,sep="",collapse="','"),"'))"))
}
```
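
As a rough illustration of what the package-listing code above produces, the sketch below builds the same kind of installation command from a hard-coded vector. The package names in `pkg` are hypothetical stand-ins for the result of `unique(renv::dependencies()$Package)`, and `install_cmd` is a name introduced here only for illustration.

```r
# Hypothetical stand-in for unique(renv::dependencies()$Package)
pkg <- c("dplyr", "ggplot2", "mkteachr", "manipulateWidget")

# Packages that should not appear in the student-facing list
pkg_discard <- c("mkteachr", "manipulateWidget")
pkg_list <- pkg[!pkg %in% pkg_discard]

# Assemble the install.packages() call as plain text
install_cmd <- paste0(
  "install.packages(c('",
  paste(pkg_list, collapse = "','"),
  "'))"
)
cat(install_cmd, "\n", sep = "")
```

Building the command as text keeps the rendered page copy-pasteable, since students only see the final `install.packages()` line and not the discovery logic.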

`r fa1("chevron-circle-right")` &nbsp; **Install Docker**
224 changes: 224 additions & 0 deletions home_projects.Rmd
@@ -0,0 +1,224 @@
---
title: "Projects"
output:
bookdown::html_document2:
highlight: textmate
toc: false
toc_float:
collapsed: true
smooth_scroll: true
print: false
toc_depth: 4
number_sections: false
df_print: default
code_folding: none
self_contained: false
keep_md: false
encoding: 'UTF-8'
css: "assets/lab.css"
include:
after_body: assets/footer-lab.html
---

```{r,child="assets/header-lab.Rmd"}
```

Hands-on analysis of actual data is the best way to learn R programming. This page contains some data sets that you can use to explore what you have learned in this course. For each data set, a brief description as well as download instructions are provided.

<div class="alert alert-info">
<strong> Try to focus on using the tools from the course to explore the data, rather than worrying about producing a perfect report with a coherent analysis workflow.</strong>
</div>


On the last day you will present your Rmd file (or rather, the resulting html report) and share with the class what your data was about.

---

## Palmer penguins 🐧

- This is a data set containing a series of measurements for three species of penguins collected at Palmer Station in Antarctica.
- Data description: <https://vincentarelbundock.github.io/Rdatasets/doc/heplots/peng.html>

<details>
<summary>Download instructions</summary>
```{r, warning=F, message=F}
penguins <- read.table("https://vincentarelbundock.github.io/Rdatasets/csv/heplots/peng.csv", header = T, sep = ",")
str(penguins)
```
</details>
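
The `read.table()` call above treats the first line as column names (`header`) and splits fields on commas (`sep`). A minimal offline sketch of the same call, assuming a tiny made-up CSV string (the column names and values below are invented for illustration and are not taken from the penguin data):

```r
# Made-up CSV content; not real penguin measurements
csv_text <- "species,bill_len\nAdelie,39.1\nGentoo,46.5"

# Same arguments as in the download instructions, but reading from a string
toy <- read.table(text = csv_text, header = TRUE, sep = ",")

# Check column names and types
str(toy)
```

Writing `TRUE` rather than `T` is a little safer, because `T` is an ordinary variable that can be reassigned.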

---

## Drinking habits 🍷

- Data from a national survey on the drinking habits of American citizens in 2001 and 2002.
- Data description: <https://vincentarelbundock.github.io/Rdatasets/doc/stevedata/nesarc_drinkspd.html>

<details>
<summary>Download instructions</summary>
```{r}
library(dplyr)
# this will download the csv file directly from the web
drinks <- read.table("https://vincentarelbundock.github.io/Rdatasets/csv/stevedata/nesarc_drinkspd.csv", header = T, sep = ",")
# the lines below will take a sample from the full data set
set.seed(seed = 2)
drinks <- sample_n(drinks, size = 3000, replace = F)
# and here we check the structure of the data
str(drinks)
```
</details>
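
Setting the seed before sampling is what makes everyone draw the same 3000 rows. A small base R sketch of that idea, assuming a toy data frame in place of the downloaded survey (`toy`, `first` and `second` are names introduced here for illustration):

```r
# Toy data frame standing in for a large downloaded data set
toy <- data.frame(id = 1:10000)

# Draw 3000 rows without replacement, twice, with the same seed
set.seed(2)
first <- toy[sample(nrow(toy), size = 3000, replace = FALSE), , drop = FALSE]
set.seed(2)
second <- toy[sample(nrow(toy), size = 3000, replace = FALSE), , drop = FALSE]

# Same seed, same rows
identical(first, second)
```

Without the second `set.seed(2)` call, the two samples would almost certainly differ.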

---

## Car crashes 🚗

- Data from car accidents in the US between 1997 and 2002.
- Data description: <https://vincentarelbundock.github.io/Rdatasets/doc/DAAG/nassCDS.html>

<details>
<summary>Download instructions</summary>
```{r}
library(dplyr)
# this will download the csv file directly from the web
crashes <- read.table("https://vincentarelbundock.github.io/Rdatasets/csv/DAAG/nassCDS.csv", header = T, sep = ",")
# the lines below will take a sample from the full data set
set.seed(seed = 2)
crashes <- sample_n(crashes, size = 3000, replace = F)
# and here we check the structure of the data
str(crashes)
```
</details>

---

## Gapminder health and wealth 📈

- This is a collection of country indicators from the Gapminder dataset for the years 2000-2016.
- Data description: <https://vincentarelbundock.github.io/Rdatasets/doc/dslabs/gapminder.html>

<details>
<summary>Download instructions</summary>
```{r}
library(dplyr)
# this will download the csv file directly from the web
gapminder <- read.table("https://vincentarelbundock.github.io/Rdatasets/csv/dslabs/gapminder.csv", header = T, sep = ",")
# here we filter the data to remove anything before the year 2000
gapminder <- gapminder |> filter(year >= 2000)
# and here we check the structure of the data
str(gapminder)
```
</details>

---

## StackOverflow survey 🖥️

- This is a downsampled and modified version of one of StackOverflow's annual surveys where users respond to a series of questions related to careers in technology and coding.
- Data description: <https://vincentarelbundock.github.io/Rdatasets/doc/modeldata/stackoverflow.html>

<details>
<summary>Download instructions</summary>
```{r}
library(dplyr)
# this will download the csv file directly from the web
stackoverflow <- read.table("https://vincentarelbundock.github.io/Rdatasets/csv/modeldata/stackoverflow.csv", header = T, sep = ",")
# the lines below will take a sample from the full data set
set.seed(2)
stackoverflow <- sample_n(stackoverflow, size = 3000)
# and here we check the structure of the data
str(stackoverflow)
```
</details>

---

## Doctor visits 🤒

- Data on the frequency of doctor visits in the past two weeks in Australia for the years 1977 and 1978.
- Data description: <https://vincentarelbundock.github.io/Rdatasets/doc/AER/DoctorVisits.html>

<details>
<summary>Download instructions</summary>
```{r}
library(dplyr)
# this will download the csv file directly from the web
doctor <- read.table("https://vincentarelbundock.github.io/Rdatasets/csv/AER/DoctorVisits.csv", header = T, sep = ",")
# the lines below will take a sample from the full data set
set.seed(2)
doctor <- sample_n(doctor, size = 3000)
# and here we check the structure of the data
str(doctor)
```
</details>

---

## Video Game Sales 🎮

- This data set contains sales figures for video game titles released in 2001 and 2002.
- Data description: <https://mavenanalytics.io/data-playground?order=date_added%2Cdesc&search=Video%20Game%20Sales>
- Click on "Preview Data" and "VG Data Dictionary" to see the description for each column.

<details>
<summary>Download instructions</summary>
```{r, warning=F, message=F}
library(dplyr)
library(lubridate)
# this will download the file to your working directory
download.file(url = "https://maven-datasets.s3.amazonaws.com/Video+Game+Sales/Video+Game+Sales.zip", destfile = "video_game_sales.zip")
# this will unzip the file and read it into R
videogames <- read.table(unz(description = "video_game_sales.zip", filename = "vgchartz-2024.csv"), header = T, sep = ",", quote = "\"", fill = T)
# this will select rows corresponding to years 2001 and 2002
videogames <- filter(videogames, year(as_date(release_date)) %in% c(2001,2002))
# and here we check the structure of the data
str(videogames)
```
</details>
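
The year filter above relies on `lubridate`. A base R sketch of the same idea, assuming a small made-up data frame (the titles and dates are invented, and only the `release_date` column name is borrowed from the code above):

```r
# Made-up rows standing in for the video game data
toy <- data.frame(
  title = c("A", "B", "C"),
  release_date = c("2000-11-21", "2001-06-15", "2002-03-01"),
  stringsAsFactors = FALSE
)

# Extract the year from each date string and keep 2001 and 2002
release_year <- as.integer(format(as.Date(toy$release_date), "%Y"))
toy_sub <- toy[release_year %in% c(2001, 2002), ]
print(toy_sub)
```

This assumes dates in `YYYY-MM-DD` form; other layouts would need a `format` argument in `as.Date()`.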

---

## LEGO Sets 🏗️

- This data set contains the description of all LEGO sets released from 2000 to 2009.
- Data description: <https://mavenanalytics.io/data-playground?order=date_added%2Cdesc&search=lego>
- Click on "Preview Data" and "VG Data Dictionary" to see the description for each column.

<details>
<summary>Download instructions</summary>
```{r, warning=F, message=F}
library(dplyr)
# this will download the file to your working directory
download.file(url = "https://maven-datasets.s3.amazonaws.com/LEGO+Sets/LEGO+Sets.zip", destfile = "lego.csv.zip")
# this will unzip the file and read it into R
lego <- read.table(unz(description = "lego.csv.zip", filename = "lego_sets.csv"), header = T, sep = ",", quote = "\"", fill = T)
# this will select rows corresponding to years 2000-2009
lego <- filter(lego, year %in% seq(2000,2009,1))
# and here we check the structure of the data
str(lego)
```
</details>

---

## Shark attacks 🦈

- This data set contains information on shark attack records from all over the world.
- Data description: <https://mavenanalytics.io/data-playground?order=date_added%2Cdesc&search=shark>
- Click on "Preview Data" and "VG Data Dictionary" to see the description for each column.

<details>
<summary>Download instructions</summary>
```{r, warning=F, message=F}
library(dplyr)
# this will download the file to your working directory
download.file(url = "https://maven-datasets.s3.amazonaws.com/Shark+Attacks/attacks.csv.zip", destfile = "attacks.csv.zip")
# this will unzip the file and read it into R
sharks <- read.table(unz(description = "attacks.csv.zip", filename = "attacks.csv"), header = T, sep = ",", quote = "\"", fill = T)
# the lines below will take a sample from the full data set
set.seed(seed = 2)
sharks <- sample_n(sharks, size = 3000, replace = F)
str(sharks)
```
</details>

***
Binary file added images/data_frame.png
Binary file modified images/data_structures.png
5 changes: 5 additions & 0 deletions lab_graphics.Rmd
@@ -593,3 +593,8 @@ You task here is to use the already acquired R knowledge to plot an interesting
- Be creative,
- Visualize selected variables using a boxplot and a histogram in one plot (HINT: parameter mfrow),
- Discuss the result with your colleagues and TAs.

```{r}
unlink(local_file_path)
```
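
The added chunk removes a file that the lab downloaded earlier. `local_file_path` is defined elsewhere in `lab_graphics.Rmd` and is not shown in this diff, so the sketch below assumes a temporary file in its place to show what `unlink()` does:

```r
# Hypothetical path; in the lab, local_file_path is set earlier in the file
local_file_path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3), local_file_path, row.names = FALSE)

# The file exists after writing
existed_before <- file.exists(local_file_path)

# Delete it, as the added chunk does
unlink(local_file_path)
exists_after <- file.exists(local_file_path)
```

Cleaning up this way keeps downloaded files out of the rendered site directory.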
