---
output: github_document
always_allow_html: true
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%",
  warning = FALSE,
  message = FALSE
)
library(dplyr)
library(gtsummary)
library(gnomeR)
```
# gnomeR
<!-- badges: start -->
[![R-CMD-check](https://github.com/MSKCC-Epi-Bio/gnomeR/workflows/R-CMD-check/badge.svg)](https://github.com/MSKCC-Epi-Bio/gnomeR/actions)
[![Codecov test coverage](https://codecov.io/gh/MSKCC-Epi-Bio/gnomeR/branch/main/graph/badge.svg)](https://app.codecov.io/gh/MSKCC-Epi-Bio/gnomeR?branch=main)
<!-- badges: end -->
## Installation
You can install the development version of `gnomeR` from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("MSKCC-Epi-Bio/gnomeR")
```
You can also install its companion package, {cbioportalR}, for downloading cBioPortal data:
``` r
devtools::install_github("karissawhiting/cbioportalR")
```
## Introduction
The `gnomeR` package provides a consistent framework for genetic data processing, visualization, and analysis. It is primarily targeted at IMPACT data sets but can also be applied to any genomic data provided by cBioPortal. With {gnomeR} and {cbioportalR} you can:
- [**Download and gather data from cBioPortal**](https://www.karissawhiting.com/cbioportalR/) - Pull from the cBioPortal database by study ID or sample ID.
- **OncoKB annotate data (coming soon)** - Filter genomic data for known oncogenic alterations.
- **Process genomic data** - Process retrieved mutation/maf, fusion, copy-number alteration, and segmentation data (when available) into analysis-ready formats.
- **Visualize processed data** - Create summary plots from processed data.
- **Analyze processed data** - Analyze associations between genomic variables and clinical variables or outcomes.
{gnomeR} is part of [gnomeverse](https://mskcc-epi-bio.github.io/genomeverse/), a collection of R packages designed to work together seamlessly to create reproducible clinico-genomic analysis pipelines.
## Getting Set Up
{gnomeR} works with any genomic data that follows cBioPortal guidelines for [mutation](https://docs.cbioportal.org/5.1-data-loading/data-loading/file-formats#data-file-5), [CNA](https://docs.cbioportal.org/5.1-data-loading/data-loading/file-formats#discrete-copy-number-data), or [fusion](https://docs.cbioportal.org/5.1-data-loading/data-loading/file-formats#structural-variant-data) data file formats.
If you wish to pull the data directly from cBioPortal, see the [{cbioportalR}](https://karissawhiting.github.io/cbioportalR/) package documentation for how to set up your credentials.
## Processing Genomic Data
The examples below use the data sets `mutations`, `sv`, and `cna`, which were pulled from cBioPortal and are included in the package as example data sets. We will randomly select 50 samples for the examples:
```{r}
set.seed(123)
mut <- gnomeR::mutations
cna <- gnomeR::cna
sv <- gnomeR::sv
un <- unique(mut$sampleId)
sample_patients <- sample(un, size = 50, replace = FALSE)
```
The main data processing function is `create_gene_binary()`, which takes mutation, CNA, and fusion files as input and outputs a binary matrix with N rows (one per sample) and M columns for the genes included in the data set. You can pass a vector of sample IDs to the `samples` argument; every listed sample is then kept in the resulting data frame, even if it has no alterations.
```{r }
gen_dat <- create_gene_binary(samples = sample_patients,
                              mutation = mut,
                              fusion = sv,
                              cna = cna)
head(gen_dat[, 1:6])
```
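To make the idea of a sample-by-gene binary matrix concrete, here is a minimal base R sketch on toy data. It is not how `create_gene_binary()` is implemented; the data frame, its column names (`sample_id`, `hugo_symbol`), and the sample IDs are hypothetical and chosen only for illustration.

``` r
# Toy long-format mutation table (hypothetical column names and values).
toy_mut <- data.frame(
  sample_id   = c("P1", "P1", "P2", "P4"),
  hugo_symbol = c("TP53", "KRAS", "TP53", "EGFR"),
  stringsAsFactors = FALSE
)

# Samples to keep. "P3" has no recorded alterations; "P4" is left out on purpose.
toy_samples <- c("P1", "P2", "P3")

# Restrict to the requested samples, then cross-tabulate sample by gene.
# Using factor levels keeps "P3" as an all-zero row.
kept <- toy_mut[toy_mut$sample_id %in% toy_samples, ]
counts <- table(
  factor(kept$sample_id, levels = toy_samples),
  kept$hugo_symbol
)

# Turn counts into 0/1 indicators: 1 if the gene is altered in that sample.
toy_binary <- as.data.frame((unclass(counts) > 0) * 1L)
toy_binary
```

The point of the sketch is the factor levels: a sample that was requested but has no alterations still gets a row, filled with zeros, which mirrors the behavior described above for the `samples` argument.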
By default, mutations, CNA and fusions will be returned in separate columns. You can combine these at the gene level using the following:
```{r}
by_gene <- gen_dat %>%
  summarize_by_gene()
head(by_gene[, 1:6])
```
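As a rough illustration of what combining at the gene level means, the base R sketch below collapses several alteration columns for the same gene into a single indicator. It is a toy under stated assumptions, not the implementation of `summarize_by_gene()`: the `.Amp`, `.Del`, and `.fus` column suffixes and all values are assumed for the example.

``` r
# Toy binary matrix with one column per gene and alteration type.
# The ".Amp" / ".Del" / ".fus" suffixes are assumptions for illustration.
toy_binary <- data.frame(
  sample_id = c("P1", "P2", "P3"),
  TP53      = c(1, 0, 0),
  ERBB2     = c(0, 0, 0),
  ERBB2.Amp = c(0, 1, 0),
  ALK.fus   = c(0, 0, 1),
  check.names = FALSE
)

alt_cols <- setdiff(names(toy_binary), "sample_id")

# Strip the alteration suffix to recover the gene each column belongs to.
gene_of <- sub("\\.(Amp|Del|fus)$", "", alt_cols)

# For each gene, take the row-wise maximum over its columns:
# the result is 1 if the sample has any alteration in that gene.
collapsed <- sapply(unique(gene_of), function(g) {
  cols <- alt_cols[gene_of == g]
  do.call(pmax, unname(as.list(toy_binary[cols])))
})

toy_by_gene <- data.frame(sample_id = toy_binary$sample_id, collapsed)
toy_by_gene
```

Here sample `P2` ends up with `ERBB2 = 1` because of its amplification column, even though its mutation column is 0.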
## Visualize
You can visualize your processed and raw alteration data sets using {gnomeR}'s many data visualization functions.
Quickly visualize mutation characteristics with `ggvarclass()`,
`ggvartype()`, `ggsnvclass()`, `ggsamplevar()`, `ggtopgenes()`, `gggenecor()`, and `ggcomut()`.
```{r}
ggvarclass(mutation = mut)
```
## Summarize & Analyze
You can tabulate and summarize your genomic data frame using the `tbl_genomic()` function, a wrapper for `gtsummary::tbl_summary()`.
```{r }
gen_dat <- gen_dat %>%
  dplyr::mutate(trt_status = sample(x = c("pre-trt", "post-trt"),
                                    size = nrow(gen_dat), replace = TRUE))
```
```{r, tbl_genomic }
gene_tbl_trt <- gen_dat %>%
  subset_by_frequency(t = .1, other_vars = trt_status) %>%
  tbl_genomic(by = trt_status) %>%
  gtsummary::add_p()
```
```{r tbl_genomic_print, include = FALSE}
#gt::gtsave(as_gt(gene_tbl_trt), file = file.path(tempdir(), "temp.png"))
gt::gtsave(as_gt(gene_tbl_trt), filename = here::here("man", "figures" , "README-tbl_genomic_print.png"))
```
```{r out.width = "30%", echo = FALSE}
knitr::include_graphics(here::here("man/figures/README-tbl_genomic_print.png"))
```
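The table code above calls `subset_by_frequency(t = .1, ...)` before tabulating. The base R sketch below illustrates the general idea of filtering a binary matrix by alteration frequency. It assumes that columns at or above the threshold are kept, which may differ in detail from the package's behavior, and all data and names in it are made up.

``` r
# Toy gene-level binary matrix for 10 samples (made-up values).
toy_binary <- data.frame(
  sample_id = paste0("P", 1:10),
  TP53 = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0),
  KRAS = c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  RARE = rep(0, 10)
)

# Hypothetical frequency threshold: keep genes altered in >= 20% of samples.
threshold <- 0.2

gene_cols <- setdiff(names(toy_binary), "sample_id")

# Alteration frequency per gene is the column mean of the 0/1 indicators.
freq <- colMeans(toy_binary[gene_cols], na.rm = TRUE)

# Keep the ID column plus the genes that meet the threshold.
keep <- names(freq)[freq >= threshold]
toy_frequent <- toy_binary[c("sample_id", keep)]
toy_frequent
```

Filtering first keeps the summary table focused on genes that are altered often enough to compare between groups.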
Additionally, you can analyze custom pathways, or a set of default gene pathways using `add_pathways()`:
```{r}
path_by_trt <- gen_dat %>%
  add_pathways() %>%
  select(sample_id, trt_status, contains("pathway_")) %>%
  tbl_genomic(by = trt_status) %>%
  gtsummary::add_p()
```
```{r tbl_genomic_pathway, include = FALSE}
gt::gtsave(as_gt(path_by_trt), filename = here::here("man", "figures" , "README-path_by_trt.png"))
```
```{r echo=FALSE, out.width="30%"}
knitr::include_graphics(here::here("man/figures/README-path_by_trt.png"))
```
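Conceptually, a pathway column flags whether a sample has an alteration in any gene belonging to that pathway. The base R sketch below shows that idea on toy data. It is not the implementation of `add_pathways()`: the gene lists are hypothetical and are not the package's default pathway definitions, and only the `pathway_` prefix is taken from the code above.

``` r
# Toy gene-level binary matrix (made-up values).
toy_by_gene <- data.frame(
  sample_id = c("P1", "P2", "P3"),
  TP53 = c(1, 0, 0),
  MDM2 = c(0, 0, 0),
  KRAS = c(0, 1, 0),
  BRAF = c(0, 1, 0)
)

# Hypothetical pathway definitions; "NRAS" is absent from the data on purpose.
toy_pathways <- list(
  pathway_p53 = c("TP53", "MDM2"),
  pathway_ras = c("KRAS", "BRAF", "NRAS")
)

# A sample is flagged for a pathway if any of the pathway's genes
# that are present in the data is altered.
pathway_flags <- sapply(toy_pathways, function(genes) {
  present <- intersect(genes, names(toy_by_gene))
  as.integer(rowSums(toy_by_gene[present]) > 0)
})

toy_with_pathways <- cbind(toy_by_gene, pathway_flags)
toy_with_pathways
```

Genes listed in a pathway but missing from the data are simply ignored in this sketch, so a custom gene list does not need to match the sequencing panel exactly.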
# Contributing
Please note that the gnomeR project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.
Thank you to all contributors!
`r usethis::use_tidy_thanks("MSKCC-Epi-Bio/gnomeR", from = "2020-01-01") %>% {glue::glue("[@{.}](https://github.com/{.})")} %>% glue::glue_collapse(sep = ", ", last = ", and ")`