generated from jhudsl/AnVIL_Template
-
Notifications
You must be signed in to change notification settings - Fork 0
/
04-starting-rstudio.Rmd
229 lines (153 loc) · 14 KB
/
04-starting-rstudio.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
```{r, include = FALSE}
ottrpal::set_knitr_image_path()
```
# (PART\*) RSTUDIO ON ANVIL {-}
If you are brand-new to AnVIL, make sure to take a look at the AnVIL Guide on [Getting Started](https://jhudatascience.org/AnVIL_Book_Getting_Started) to learn how to set up an account and create a workspace before reading any farther.
Once you have created a workspace, you can create an RStudio cloud environment. This will allow you to interface with data and perform genomics-based analyses with add on packages from the Bioconductor community.
## Launch RStudio Cloud Environment
1. Click on the name of the Workspace you just created. You should be routed to a link that looks like: `https://anvil.terra.bio/#workspaces/<billing-project>/<workspace-name>`.
1. On the top right, Click the gear icon to access your Cloud Environment options.
```{r, echo=FALSE, fig.alt='Screenshot of the newly created Workspace. The gear icon to create a new cloud environment is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.gdde5ec9a4d_1_34")
```
1. You will see a list of costs because it costs a small amount of money to use cloud computing. Click "CUSTOMIZE".
```{r, echo=FALSE, fig.alt='Screenshot of the cloud environment popout menu. The "Customize" button is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.gdde5ec9a4d_1_50")
```
1. Click on the first drop down menu to see what other software configurations are available.
```{r, echo=FALSE, fig.alt='Screenshot of the cloud environment popout menu. The first dropdown menu for options, the Application configuration menu, is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.gdde5ec9a4d_1_11")
```
1. Scroll down and select RStudio from the Community-Maintained RStudio Environments section. **NOTE**: AnVIL is very versatile and can scale up to use very powerful cloud computers. It's very important that you select the cloud computing environment described here to avoid runaway costs.
```{r, echo=FALSE, fig.alt='Screenshot of the Application configuration menu. The community maintained RStudio environment is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.ge08067d6e2_0_0")
```
1. Leave everything else as-is. To create your RStudio Cloud Environment, click on the “CREATE” button.
```{r, echo=FALSE, fig.alt='Screenshot of the Application configuration menu. The "Create" button is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.ge08067d6e2_0_34")
```
1. Your Cloud Environment will be available in a few minutes after the cloud resources are provisioned and your software starts up. The upper right corner displays the status and should say “Creating” while resources are being provisioned.
```{r, echo=FALSE, fig.alt='Screenshot of the Workspace page. A cloud environment for RStudio is being created. The loading icon on the top right of the page is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.ge08067d6e2_0_6")
```
1. After a few minutes, you will see the status change to “Running”.
```{r, echo=FALSE, fig.alt='Screenshot of the Workspace page. A cloud environment for RStudio has been created. The icon on the top right showing that the cloud environment is running is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.ge08067d6e2_0_10")
```
1. Click on the “R” icon to launch RStudio.
```{r, echo=FALSE, fig.alt='Screenshot of the Workspace page. A cloud environment for RStudio has been created. The R button that launches the RStudio interface is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.ge08067d6e2_0_43")
```
1. You should now see the RStudio interface with information about the version printed to the console.
```{r, echo=FALSE, fig.alt='Screenshot of the RStudio environment interface.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.ge08067d6e2_0_14")
```
## Tour RStudio
Next, we will be using RStudio and the package `Glimma` to create interactive plots. See [this vignette](https://bioconductor.org/packages/release/bioc/vignettes/Glimma/inst/doc/limma_edger.html) for more information.
1. The Bioconductor team has created a very useful package to programmatically interact with Terra and Google Cloud. Install the `AnVIL` package. It will make some steps easier as we go along.
```{r, eval = FALSE}
BiocManager::install("AnVIL")
```
```{r, echo=FALSE, fig.alt='Screenshot of the RStudio environment interface. Code has been typed in the console and is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g11f12bc99af_0_49")
```
1. You can now quickly install precompiled binaries using the AnVIL package’s `install()` function. We will use it to install the `Glimma` package and the `airway` package. The `airway` package contains a `SummarizedExperiment` data class. This data describes an RNA-Seq experiment on four human airway smooth muscle cell lines treated with dexamethasone. We will learn more about SummarizedExperiments in [following chapters](the-summarizedexperiment-class.html#summarizedexperiment).
```{r, eval = FALSE}
AnVIL::install(c("Glimma", "airway"))
```
```{r, echo=FALSE, fig.alt='Screenshot of the RStudio environment interface. Code has been typed in the console and is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g11f12bc99af_0_56")
```
1. Load the example data.
```{r, eval = FALSE}
library(airway)
data(airway)
```
```{r, echo=FALSE, fig.alt='Screenshot of the RStudio environment interface. Code has been typed in the console and is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g11f12bc99af_0_56")
```
1. The multidimensional scaling (MDS) plot is frequently used to explore differences in samples. When this data is MDS transformed, the first two dimensions explain the greatest variance between samples, and the amount of variance decreases monotonically with increasing dimension. The following code will launch a new window where you can interact with the MDS plot.
```{r, eval = FALSE}
Glimma::glimmaMDS(assay(airway), group = colData(airway)$dex)
```
```{r, echo=FALSE, fig.alt='Screenshot of the Glimma popout showing the data in an MDS plot. All data points are blue.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g11f12bc99af_0_70")
```
1. Change the `colour_by` setting to "groups" so you can easily distinguish between groups. In this data, the "group" is the treatment.
```{r, echo=FALSE, fig.alt='Screenshot of the Glimma popout showing the data in an MDS plot. Data points are colored blue and orange by group. The colour by dropdown menu on the interactive plot is hightlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g11f12bc99af_0_77")
```
1. You can download the interactive html file by clicking on "Save As".
```{r, echo=FALSE, fig.alt='Screenshot of the Glimma popout showing the data in an MDS plot. The Save As menu is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g1204ed6da7f_0_0")
```
1. You can also download plots and other files created directly in RStudio. To download the following plot, click on "Export" and save in your preferred format to the default directory. This saves the file in your cloud environment.
```{r, eval = FALSE}
limma::plotMDS(airway)
```
```{r, echo=FALSE, fig.alt='Screenshot of the RStudio interface. A plot has been created. The Export menu has been highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g1204ed6da7f_0_12")
```
1. You should see the plot in the "Files" pane.
```{r, echo=FALSE, fig.alt='Screenshot of the RStudio interface. A plot has been created. The saved pdf file is now visible under the "Files" pane.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g1204ed6da7f_0_19")
```
1. Select this file and click "More" > "Export"
```{r, echo=FALSE, fig.alt='Screenshot of the RStudio interface. A plot has been created. The saved pdf file is now visible under the "Files" pane. The "More" and "Export" menus have been highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g1204ed6db6a_0_0")
```
1. Select "Download" to save the file to your local machine.
```{r, echo=FALSE, fig.alt='Screenshot of the RStudio interface. The popup to download the selected file has been highlighted,'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g1204ed6db6a_0_8")
```
## More Practice with `iSEE`
`iSEE` is a Bioconductor package that provides an interactive Shiny-based graphical user interface for exploring data stored in SummarizedExperiment objects [@RueAlbrecht2018]. Run the following.
```{r, eval = FALSE}
# Install iSEE
AnVIL::install("iSEE")
# Launch app on airway data
iSEE::iSEE(airway)
```
```{r, echo=FALSE, fig.alt='Screenshot of the RStudio environment interface. Code has been typed in the console and is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g11f12bc99af_0_56")
```
The Shiny app will allow you to explore genes and samples.
```{r, echo=FALSE, fig.alt='Screenshot of the iSEE popout. The gene related windows are shown.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g1204ed6db6a_1_12")
```
```{r, echo=FALSE, fig.alt='Screenshot of the iSEE popout. The sample related windows are shown.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g1204ed6db6a_1_19")
```
## Pause RStudio {#stopping}
1. The upper right corner reminds you that you are accruing cloud computing costs.
```{r, echo=FALSE, fig.alt='Screenshot of the RStudio interface. The icon on the top right showing that the cloud environment is running is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g11f12bc99af_0_84")
```
1. You should minimize charges when you are not performing an analysis. You can do this by clicking on “Stop cloud environment”. This will release the CPU and memory resources for other people to use. Note that your work will be saved in the environment and continue to accrue a very small cost. Your instructor can delete these environments to stop costs accruing, so it's a good idea to save code or output somewhere else, such as GitHub or your local machine.
```{r, echo=FALSE, fig.alt='Screenshot of the RStudio interface. The stop icon on the top right which stops the cloud environment is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1BLTCaogA04bbeSD1tR1Wt-mVceQA6FHXa8FmFzIARrg/edit#slide=id.g11f12bc99af_0_91")
```
## Delete RStudio Cloud Environment
1. [Stopping](#stopping) your cloud environment only pauses your work. When you are ready to delete the cloud environment, click on the gear icon in the upper right corner to “Update cloud environment”.
```{r, echo=FALSE, fig.alt='Screenshot of the Workspace page. The gear icon on the top right that updates the cloud environment is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.ge1182913a6_0_41")
```
1. Click on “Delete Environment Options”.
```{r, echo=FALSE, fig.alt='Screenshot of the cloud environment popout. "Delete environment options" is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.ge1182913a6_0_20")
```
1. If you are certain that you do not need the data and configuration on your disk, you should select "Delete everything, including persistent disk".
```{r, echo=FALSE, fig.alt='Screenshot of the cloud environment popout. "Delete everything, including persistent disk" is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.ge1182913a6_0_46")
```
1. Select "DELETE".
```{r, echo=FALSE, fig.alt='Screenshot of the cloud environment popout. "Delete" is highlighted.'}
ottrpal::include_slide("https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw/edit#slide=id.ge1182913a6_0_51")
```
## Video Guide
In addition to the steps above, you can review this video guide on how to launch RStudio on AnVIL.
<iframe src="https://drive.google.com/file/d/1v72ZG8JIRDUaewFQgGfcCO_qoM4eYmYX/preview" width="640" height="360" allow="autoplay"></iframe>
The slides for this tutorial are are located [here](https://docs.google.com/presentation/d/1eypYLLqD11-NwHLs4adGpcuSB07dYEJfAaALSMvgzqw).
```{r}
sessionInfo()
```