Merge pull request #112 from nf-core/dev
Dev -> master for v1.2.0
Jonathan Manning authored Apr 19, 2023
2 parents 47e3d92 + f7ec824 commit 3a849f0
Showing 37 changed files with 828 additions and 225 deletions.
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,20 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v1.2.0 - 2023-04-19

### `Added`

- [[#97](https://github.com/nf-core/differentialabundance/issues/97)] - Allow for subsetting of samples for specific contrasts ([@pinin4fjords](https://github.com/pinin4fjords), reported by [@danhalligan-hx](https://github.com/danhalligan-hx), review by [@WackerO](https://github.com/WackerO))
- [[#105](https://github.com/nf-core/differentialabundance/pull/105)] - Enabled multiple GMT/GMX files for GSEA ([@WackerO](https://github.com/WackerO), reported by [@grst](https://github.com/grst), review by [@pinin4fjords](https://github.com/pinin4fjords))
- [[#108](https://github.com/nf-core/differentialabundance/issues/108)] - Add shiny app generation (starting feature set) ([@pinin4fjords](https://github.com/pinin4fjords), review by [@WackerO](https://github.com/WackerO))
- [[#110](https://github.com/nf-core/differentialabundance/pull/110)] - Add shiny app outputs to tower.yml ([@pinin4fjords](https://github.com/pinin4fjords), review by [@WackerO](https://github.com/WackerO), [@maxulysse](https://github.com/maxulysse))

### `Fixed`

- [[#95](https://github.com/nf-core/differentialabundance/issues/95)] - Pipeline doesn't check for gene sets file specification when GSEA is activated ([@pinin4fjords](https://github.com/pinin4fjords), reported by [@danhalligan-hx](https://github.com/danhalligan-hx), review by [@FriederikeHanssen](https://github.com/FriederikeHanssen))
- [[#93](https://github.com/nf-core/differentialabundance/issues/93)] - Shouldn't be re-using the single exploratory palette across multiple informative variables ([@pinin4fjords](https://github.com/pinin4fjords), review by [@matthdsm](https://github.com/matthdsm))

## v1.1.1 - 2023-03-02

### `Fixed`
23 changes: 22 additions & 1 deletion README.md
@@ -27,7 +27,8 @@ On release, automated continuous integration tests run the pipeline on a full-si
3. Run differential analysis over all contrasts specified.
4. Optionally run a differential gene set analysis.
5. Generate exploratory and differential analysis plots for interpretation.
-6. Build an HTML report based on R markdown, with interactive plots (where possible) and tables.
+6. Optionally build and (if specified) deploy a Shiny app for fully interactive mining of results.
+7. Build an HTML report based on R markdown, with interactive plots (where possible) and tables.

## Quick Start

@@ -73,6 +74,26 @@ Affymetrix microarray:
-profile affy,<docker/singularity/podman/shifter/charliecloud/conda/institute>
```

### Reporting

The pipeline reports its outcomes in two forms.

#### Markdown-derived HTML report

![screenshot of the markdown report](docs/images/markdown_report.png "Markdown report")

The primary workflow output is an HTML-format report produced from an [R markdown template](assets/differentialabundance_report.Rmd). This leverages helper functions from [shinyngs](https://github.com/pinin4fjords/shinyngs) to produce rich plots and tables, but does not provide significant interactivity.

#### Shiny-based data mining app

A second optional output is produced by leveraging [shinyngs](https://github.com/pinin4fjords/shinyngs) to build an interactive Shiny application. This allows more interaction with the data, setting of thresholds etc.

![screenshot of the ShinyNGS contrast table](docs/images/shinyngs_contrast_table.png "ShinyNGS contrast table")

![screenshot of the ShinyNGS gene plot](docs/images/shinyngs_gene_plot.png "ShinyNGS gene plot")

By default the application is provided as an R script and associated serialised data structure, which you can use to quickly start the application locally. With proper configuration the app can also be deployed to [shinyapps.io](https://www.shinyapps.io/) - though this requires you to have an account on that service (free tier available).

## Documentation

The nf-core/differentialabundance pipeline comes with documentation about the pipeline [usage](https://nf-co.re/differentialabundance/usage), [parameters](https://nf-co.re/differentialabundance/parameters) and [output](https://nf-co.re/differentialabundance/output).
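
The two reporting modes described in this README change correspond to parameters used elsewhere in this commit (`shinyngs_build_app`, `shinyngs_deploy_to_shinyapps_io`, `shinyngs_shinyapps_account`, `shinyngs_shinyapps_app_name`, and the `report_*` settings). The following is a minimal custom-config sketch, assuming those parameters behave as their use in `conf/modules.config` suggests. The file name, title, author, account, and app name are placeholders, not values from the source.

```groovy
// custom.config (hypothetical file name), supplied with `-c custom.config`
params {
    // Shown in the HTML report title/subtitle and passed to the Shiny app
    report_title  = "my study"       // placeholder
    report_author = "A. N. Author"   // placeholder

    // Build the Shiny app alongside the HTML report
    shinyngs_build_app = true

    // Optional: also deploy the built app to shinyapps.io
    shinyngs_deploy_to_shinyapps_io = true
    shinyngs_shinyapps_account      = "my-account"   // placeholder
    shinyngs_shinyapps_app_name     = "my-app"       // placeholder
}
```

When deployment is enabled, the `SHINYNGS_APP` process requests the Nextflow secrets `SHINYAPPS_TOKEN` and `SHINYAPPS_SECRET`, so both would need to be defined beforehand, for example with `nextflow secrets set`.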
96 changes: 29 additions & 67 deletions assets/differentialabundance_report.Rmd
@@ -20,6 +20,10 @@ params:
study_type: NULL
study_name: NULL
study_abundance_type: NULL
report_file: NULL
report_title: NULL
report_author: NULL
report_description: NULL
observations_type: NULL
observations: NULL # GSE156533.samplesheet.csv
observations_id_col: NULL
@@ -86,6 +90,7 @@ params:
differential_max_pval: NULL
differential_max_qval: NULL
differential_palette_name: NULL
differential_subset_to_contrast_samples: NULL
deseq2_test: NULL
deseq2_fit_type: NULL
deseq2_sf_type: NULL
@@ -139,59 +144,6 @@ library(DT)
datatable(NULL)
```

```{r, echo=FALSE}
# this function will be available via shinyngs in a release soon but we can use it here for now
anova_pca_metadata <- function(pca_coords, pcameta, fraction_explained){
# Use 10 components or however many fewer is produced by the PCA
last_pc <- 10
if (ncol(pca_coords) < last_pc) {
last_pc <- ncol(pca_coords)
}
# Remove non-useful variables (those with 1 value, or N values where N is the
# number of samples)
pcameta <- pcameta[, chooseGroupingVariables(pcameta), drop = FALSE]
# Run anova for all PCA against the selected meta vars
run_anova <- function(meta_col, pc){
fit <- aov(pca_coords[, pc] ~ factor(pcameta[, meta_col]))
smry <- summary(fit)[[1]]
if ("Pr(>F)" %in% names(smry)) {
smry[["Pr(>F)"]][[1]]
}else{
NA
}
}
pvals <- outer(
1:ncol(pcameta),
1:last_pc,
Vectorize(run_anova)
)
# Name dimensions
dimnames(pvals) <- list(
colnames(pcameta),
paste(
paste("PC", 1:last_pc, sep = ""),
" (",
fraction_explained[1:last_pc],
"%)",
sep = ""
)
)
pvals
}
```


```{r, include=FALSE}
versions <- unlist(yaml.load_file(file.path(params$input_dir, params$versions_file)), recursive = FALSE)
params_table <- data.frame(Parameter = names(unlist(params)), Value = unlist(params), row.names = NULL)
@@ -215,11 +167,13 @@ make_params_table <- function(name, pattern = NULL, remove_pattern = FALSE){
print( htmltools::tagList(datatable(subparams, caption = paste("Parameters used for", name), rownames = FALSE, options = list(dom = dom)) ))
}
report_title <- paste0('Differential ', params$features_type, ' abundance report', ifelse(is.null(params$report_title), '', paste0(': ', params$report_title)))
report_subtitle <- paste0(ifelse(is.null(params$report_author), '', paste0('By ', params$report_author, ', ')), 'differentialabundance workflow version ', versions[["Workflow.nf-core/differentialabundance"]])
```

---
-title: "<img src=\"`r file.path(params$input_dir, params$logo)`\" style=\"float: left;\"/>Differential `r params$features_type` abundance report"
-subtitle: differentialabundance workflow version `r versions[["Workflow.nf-core/differentialabundance"]]`
+title: "<img src=\"`r file.path(params$input_dir, params$logo)`\" style=\"float: left;\"/>`r report_title`"
+subtitle: `r report_subtitle`
---

<!-- set notebook defaults -->
@@ -530,6 +484,7 @@ for (assay_type in rev(names(assay_data))){
observations[[iv]],
levels = unique(observations[[iv]])
)
pcaColorScale <- makeColorScale(length(unique(observations[[iv]])), palette = params$exploratory_palette_name)
# Make plotting data combining PCA coords with coloring groups etc
@@ -551,7 +506,7 @@ for (assay_type in rev(names(assay_data))){
ylab = labels[2],
colorby = plotdata$colorby,
plot_type = plot_types[[d]],
-palette = groupColorScale,
+palette = pcaColorScale,
legend_title = prettifyVariablename(iv),
labels = plotdata$name,
show_labels = TRUE
@@ -621,6 +576,8 @@ for (assay_type in rev(names(assay_data))){
cat(paste0("\n##### ", prettifyVariablename(assay_type), " (", iv, ")\n"))
variable_genes <- selectVariableGenes(matrix = assay_data[[assay_type]], ntop = params$exploratory_n_features)
dendroColorScale <- makeColorScale(length(unique(observations[[iv]])), palette = params$exploratory_palette_name)
p <- clusteringDendrogram(
2^assay_data[[assay_type]][variable_genes, ],
observations[, iv, drop = FALSE],
@@ -633,7 +590,7 @@ for (assay_type in rev(names(assay_data))){
params$features_type,
"s\n(", params$exploratory_clustering_method, " clustering, ", params$exploratory_cor_method, " correlation)"),
cluster_method = params$exploratory_clustering_method,
-palette = groupColorScale,
+palette = dendroColorScale,
labelspace = 0.25
)
# Defaults in shinyngs make the text in this plot a bit big for the report, so
@@ -812,17 +769,22 @@ if (any(unlist(params[paste0(possible_gene_set_methods, '_run')]))){
if (unlist(params[paste0(gene_set_method, '_run')])){
cat("\n### ", toupper(gene_set_method) ," {.tabset}\n")
reference_gsea_tables <- paste0(contrasts$id, '.gsea_report_for_', contrasts$reference, '.tsv')
target_gsea_tables <- paste0(contrasts$id, '.gsea_report_for_', contrasts$target, '.tsv')
for (gmt_file in simpleSplit(params$gsea_gene_sets)) {
gmt_name <- basename(tools::file_path_sans_ext(gmt_file))
for (i in 1:nrow(contrasts)){
cat("\n#### ", contrast_descriptions[i], "\n")
target_gsea_results <- read_metadata(target_gsea_tables[i])[,c(-2,-3)]
print( htmltools::tagList(datatable(target_gsea_results, caption = paste0("\nTarget (", contrasts$target[i], ")\n"), rownames = FALSE) ))
ref_gsea_results <- read_metadata(reference_gsea_tables[i])[,c(-2,-3)]
print( htmltools::tagList(datatable(ref_gsea_results, caption = paste0("\nReference (", contrasts$reference[i], ")\n"), rownames = FALSE) ))
cat("\n#### ", gmt_name ," {.tabset}\n")
reference_gsea_tables <- paste0(contrasts$id, ".", gmt_name, '.gsea_report_for_', contrasts$reference, '.tsv')
target_gsea_tables <- paste0(contrasts$id, ".", gmt_name, '.gsea_report_for_', contrasts$target, '.tsv')
for (i in 1:nrow(contrasts)){
cat("\n##### ", contrast_descriptions[i], "\n")
target_gsea_results <- read_metadata(target_gsea_tables[i])[,c(-2,-3)]
print( htmltools::tagList(datatable(target_gsea_results, caption = paste0("\nTarget (", contrasts$target[i], ")\n"), rownames = FALSE) ))
ref_gsea_results <- read_metadata(reference_gsea_tables[i])[,c(-2,-3)]
print( htmltools::tagList(datatable(ref_gsea_results, caption = paste0("\nReference (", contrasts$reference[i], ")\n"), rownames = FALSE) ))
}
}
}
}
4 changes: 4 additions & 0 deletions conf/affy.config
@@ -35,4 +35,8 @@ params {
differential_qval_column = "adj.P.Val"
differential_feature_id_column = "probe_id"
differential_feature_name_column = "SYMBOL"

// A small amount of upstream work is required to get the app building
// working for arrays
shinyngs_build_app = false
}
53 changes: 40 additions & 13 deletions conf/modules.config
@@ -141,10 +141,10 @@ process {
"--vst_nsub $params.deseq2_vst_nsub",
"--shrink_lfc $params.deseq2_shrink_lfc",
"--cores $params.deseq2_cores",
-"--contrast_variable \"$meta.variable\"",
-"--reference_level \"$meta.reference\"",
-"--treatment_level \"$meta.target\"",
-"--blocking_variables \"$meta.blocking\""
+"--subset_to_contrast_samples $params.differential_subset_to_contrast_samples",
+((meta.blocking == null) ? '' : "--blocking_variables $meta.blocking"),
+((meta.exclude_samples_col == null) ? '' : "--exclude_samples_col $meta.exclude_samples_col"),
+((meta.exclude_samples_values == null) ? '' : "--exclude_samples_values $meta.exclude_samples_values")
].join(' ').trim() }
}

@@ -182,23 +182,23 @@ process {
"--p.value ${params.limma_p_value}",
"--lfc ${params.limma_lfc}",
"--confint ${params.limma_confint}",
-"--contrast_variable \"$meta.variable\"",
-"--reference_level \"$meta.reference\"",
-"--treatment_level \"$meta.target\"",
-"--blocking_variables \"$meta.blocking\""
+"--subset_to_contrast_samples $params.differential_subset_to_contrast_samples",
+((meta.blocking == null) ? '' : "--blocking_variables $meta.blocking"),
+((meta.exclude_samples_col == null) ? '' : "--exclude_samples_col $meta.exclude_samples_col"),
+((meta.exclude_samples_values == null) ? '' : "--exclude_samples_values $meta.exclude_samples_values")
].join(' ').trim() }
}

withName: GSEA_GSEA {
-ext.prefix = { "${meta.id}." }
+ext.prefix = { "${meta.id}.${gene_sets.baseName}." }
publishDir = [
[
-path: { "${params.outdir}/tables/gsea/${meta.id}" },
+path: { "${params.outdir}/tables/gsea/${meta.id}/${gene_sets.baseName}" },
mode: params.publish_dir_mode,
pattern: '*gsea_report_for_*.tsv'
],
[
-path: { "${params.outdir}/plots/gsea/${meta.id}" },
+path: { "${params.outdir}/plots/gsea/${meta.id}/${gene_sets.baseName}" },
mode: params.publish_dir_mode,
pattern: '*.png'
]
@@ -260,6 +260,33 @@ process {
].join(' ').trim() }
}

withName: SHINYNGS_APP {
secret = (params.shinyngs_deploy_to_shinyapps_io) ? [ 'SHINYAPPS_TOKEN', 'SHINYAPPS_SECRET' ]: null

publishDir = [
path: { "${params.outdir}/shinyngs_app" },
mode: params.publish_dir_mode,
]
memory = { check_max( 12.GB * task.attempt, 'memory' ) }
ext.args = { [
"--assay_names \"${params.exploratory_assay_names}\"",
"--sample_id_col \"${params.observations_id_col}\"",
"--feature_id_col \"${params.features_id_col}\"",
"--diff_feature_id_col \"${params.differential_feature_id_column}\"",
"--fold_change_column \"${params.differential_fc_column}\"",
"--pval_column \"${params.differential_pval_column}\"",
"--qval_column \"${params.differential_qval_column}\"",
"--unlog_foldchanges \"${params.differential_foldchanges_logged}\"",
((params.report_title == null) ? '' : "--title \"$params.report_title\""),
((params.report_author == null) ? '' : "--author \"$params.report_author\""),
((params.report_description == null) ? '' : "--description \"$params.report_description\""),
((params.shinyngs_guess_unlog_matrices) ? "--guess_unlog_matrices" : ''),
((params.shinyngs_deploy_to_shinyapps_io) ? "--deploy_app" : ''),
((params.shinyngs_shinyapps_account == null) ? '' : "--shinyapps_account \"$params.shinyngs_shinyapps_account\""),
((params.shinyngs_shinyapps_app_name == null) ? '' : "--shinyapps_name \"$params.shinyngs_shinyapps_app_name\"")
].join(' ').trim() }
}

withName: CUSTOM_DUMPSOFTWAREVERSIONS {
publishDir = [
path: { "${params.outdir}/pipeline_info" },
@@ -269,8 +296,8 @@ process {
}

withName: RMARKDOWNNOTEBOOK {
-conda = "bioconda::r-shinyngs=1.5.5"
-container = { "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 'https://depot.galaxyproject.org/singularity/r-shinyngs:1.5.5--r42hdfd78af_0':'quay.io/biocontainers/r-shinyngs:1.5.5--r42hdfd78af_0' }" }
+conda = "bioconda::r-shinyngs=1.7.1"
+container = { "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 'https://depot.galaxyproject.org/singularity/r-shinyngs:1.7.1--r42hdfd78af_1':'quay.io/biocontainers/r-shinyngs:1.7.1--r42hdfd78af_1' }" }
publishDir = [
path: { "${params.outdir}/report" },
mode: params.publish_dir_mode,
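
The `SHINYNGS_APP` block added in this file requests 12 GB of memory, scaled by `task.attempt`. If that does not suit a given dataset, the request can be overridden from a user-supplied config using a standard Nextflow `withName` selector. In this sketch the 16 GB figure is an arbitrary illustration, not a recommendation from the source.

```groovy
// User-supplied config (hypothetical), passed with `-c`
process {
    withName: SHINYNGS_APP {
        // Illustrative value only; choose according to the size of your data
        memory = 16.GB
    }
}
```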
5 changes: 5 additions & 0 deletions conf/test_full.config
@@ -27,6 +27,11 @@ params {
// Change palette
exploratory_palette_name = 'Dark2'

// Set reporting parameters
report_title = "full tests"
report_author = "nf-core elves"
report_description = "This is a full-sized test dataset contributed by Oskar Wacker"

// Activate GSEA
gsea_run = true
gsea_gene_sets = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/mus_musculus/gene_set_analysis/mh.all.v2022.1.Mm.symbols.gmt'
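
With this release `gsea_gene_sets` may name more than one GMT/GMX file. The report template splits the value with `simpleSplit()`, and `GSEA_GSEA` outputs are published under a per-file subdirectory named after `gene_sets.baseName`. The sketch below assumes a comma-separated list is the expected form (the delimiter is inferred from the template rather than stated in this diff) and uses placeholder paths.

```groovy
params {
    gsea_run = true
    // Two gene set files in one parameter; both paths are placeholders
    gsea_gene_sets = '/path/to/hallmark.gmt,/path/to/custom_sets.gmt'
}
```

Under that assumption, tables for a contrast with id `<id>` would be published to `<outdir>/tables/gsea/<id>/hallmark/` and `<outdir>/tables/gsea/<id>/custom_sets/`, following the `publishDir` paths in `conf/modules.config`.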
Binary file added docs/images/markdown_report.png
Binary file added docs/images/shinyngs_contrast_table.png
Binary file added docs/images/shinyngs_gene_plot.png
Binary file modified docs/images/workflow.png
