Add limma for rnaseq #286

Merged
merged 50 commits into from
Oct 28, 2024
Changes from 48 commits
d82ab62
Initial limma integration
KamilMaliszArdigen Jun 19, 2024
23bcfc9
Initial limma integration - typo fix
KamilMaliszArdigen Jun 19, 2024
306f6de
Updated nextflow schema and limma selection logic
KamilMaliszArdigen Jun 28, 2024
b1425da
Updated pipeline documentation - pipeline diagram now with new path
KamilMaliszArdigen Jun 28, 2024
99142bd
Updated new param to follow convention
KamilMaliszArdigen Jun 28, 2024
db4e959
Lima integration wip
KamilMaliszArdigen Jul 3, 2024
5cfe108
Lima integration wip
KamilMaliszArdigen Jul 3, 2024
1f17b56
Rnaseq limaa parwaise implementation
KamilMaliszArdigen Aug 30, 2024
2116500
updated nextflow_schema.json
KamilMaliszArdigen Sep 3, 2024
aebbf38
Fix for test data to allow testing GSEA
KamilMaliszArdigen Sep 10, 2024
68f8091
Fix for test soft
KamilMaliszArdigen Sep 10, 2024
3606b76
Fix for limma test
KamilMaliszArdigen Sep 10, 2024
bec398a
Fix for wrong header in normalised counts
KamilMaliszArdigen Sep 11, 2024
c9c336a
Fix for double run of GSEA
KamilMaliszArdigen Sep 12, 2024
c5ccd76
Lima mixed implementation
KamilMaliszArdigen Sep 30, 2024
a701082
Lima mixed implementation default config
KamilMaliszArdigen Oct 1, 2024
d0197a7
Limma mixed bug fix
KamilMaliszArdigen Oct 1, 2024
0971953
Refactor for local limma module
KamilMaliszArdigen Oct 9, 2024
d35d5c3
Revert of nf-core limma module
KamilMaliszArdigen Oct 9, 2024
0fdc251
modules/nf-core/limma/differential/meta.yml - sync
KamilMaliszArdigen Oct 10, 2024
3b3cac6
modules/nf-core/limma/differential/meta.yml - sync
KamilMaliszArdigen Oct 10, 2024
bfbd3c0
Updates related to nf-core lint
KamilMaliszArdigen Oct 10, 2024
648bed2
environment update
KamilMaliszArdigen Oct 11, 2024
2865274
linting correction
KamilMaliszArdigen Oct 11, 2024
ff445d0
linting correction
KamilMaliszArdigen Oct 11, 2024
9df5607
Limma module update with adjustments for rnaseq data
KamilMaliszArdigen Oct 24, 2024
4338d67
Merge branch 'add_limma_for_rnaseq' of github.com:nf-core/differentia…
KamilMaliszArdigen Oct 24, 2024
fb8fd0f
Merge branch 'dev' into add_limma_for_rnaseq
KamilMaliszArdigen Oct 24, 2024
763e219
schema correction
KamilMaliszArdigen Oct 24, 2024
09702af
turn on limma tests in CI
KamilMaliszArdigen Oct 24, 2024
738be0f
addition of skipped tests dir during update
KamilMaliszArdigen Oct 24, 2024
5087fc0
Template fix
KamilMaliszArdigen Oct 24, 2024
77f1db0
Template fix
KamilMaliszArdigen Oct 24, 2024
9b89ec3
Config cleanup
KamilMaliszArdigen Oct 24, 2024
194350f
Update nextflow_schema.json
KamilMaliszArdigen Oct 25, 2024
b71c74c
Update workflows/differentialabundance.nf
KamilMaliszArdigen Oct 25, 2024
9476a08
Update nextflow.config
KamilMaliszArdigen Oct 25, 2024
c047acb
Update nextflow.config
KamilMaliszArdigen Oct 25, 2024
ede6a92
Workflow cleanup
KamilMaliszArdigen Oct 25, 2024
bfa4e42
Changelog update
KamilMaliszArdigen Oct 25, 2024
d76f4b2
Fix for doubled GSEA run afret refactoring and config cleanup
KamilMaliszArdigen Oct 25, 2024
717aaa6
Fix for pipeline diagram
KamilMaliszArdigen Oct 26, 2024
12f8f36
Fix for pipeline diagram
KamilMaliszArdigen Oct 26, 2024
f13acb2
Documentation updates
KamilMaliszArdigen Oct 27, 2024
00fef70
[automated] Fix code linting
nf-core-bot Oct 27, 2024
3f441df
Documentation updates
KamilMaliszArdigen Oct 28, 2024
d991e50
Merge branch 'add_limma_for_rnaseq' of github.com:nf-core/differentia…
KamilMaliszArdigen Oct 28, 2024
8bff062
[automated] Fix code linting
nf-core-bot Oct 28, 2024
854b81d
Update README.md
KamilMaliszArdigen Oct 28, 2024
d4fca39
Update README.md
KamilMaliszArdigen Oct 28, 2024
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -35,6 +35,7 @@ jobs:
- "test_affy"
- "test_maxquant"
- "test_soft"
- "test_rnaseq_limma"
compute_profile:
- "conda"
- "docker"
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- [[#286](https://github.com/nf-core/differentialabundance/pull/286)] - Integration of limma voom for rnaseq data ([@KamilMaliszArdigen](https://github.com/KamilMaliszArdigen), review by [@pinin4fjords](https://github.com/pinin4fjords))

### Fixed

- [[#304](https://github.com/nf-core/differentialabundance/pull/304)] - Removed TXT file options from nextflow_schema where they are equivalent to TSV to make the input files clearer ([@WackerO](https://github.com/WackerO), review by [@pinin4fjords](https://github.com/pinin4fjords))
23 changes: 22 additions & 1 deletion README.md
@@ -42,7 +42,7 @@ On release, automated continuous integration tests run the pipeline on a full-si
> [!NOTE]
> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.

RNA-seq:
RNA-seq with deseq2:

```bash
nextflow run nf-core/differentialabundance \
@@ -60,6 +60,27 @@ If you are using the outputs of the nf-core rnaseq workflow as input here **eith
- supply the raw count matrices (file names like **gene_counts.tsv**) alongside the transcript length matrix via `--transcript_length_matrix` (rnaseq versions >=3.12.0, preferred)
- **or** supply the **gene_counts_length_scaled.tsv** or **gene_counts_scaled.tsv** matrices.

RNA-seq with limma+voom:

```bash
nextflow run nf-core/differentialabundance \
--input samplesheet.csv \
--contrasts contrasts.csv \
--matrix assay_matrix.tsv \
--gtf mouse.gtf \
--outdir <OUTDIR> \
-profile rnaseq_limma,<docker/singularity/podman/shifter/charliecloud/conda/institute>
```

:::note
If you are using the outputs of the nf-core rnaseq workflow as input here, provide either the **gene_counts_length_scaled.tsv** or the **gene_counts_scaled.tsv** matrix. This follows the [recommendation from the tximport documentation](https://bioconductor.org/packages/devel/bioc/vignettes/tximport/inst/doc/tximport.html#limma-voom):

> "Because limma-voom does not use the offset matrix stored in `y$offset`, we recommend using scaled counts generated from abundances, either 'scaledTPM' or 'lengthScaledTPM'."

These matrices, **gene_counts_length_scaled.tsv** or **gene_counts_scaled.tsv**, are generated in the RNA-seq workflow and meet this recommendation by providing appropriately scaled counts for analysis.

See the [usage documentation](https://nf-co.re/differentialabundance/usage) for more information.
:::
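
The note above can be turned into a small pre-flight check. The following is a minimal sketch, not part of the pipeline: `pick_limma_matrix` is a hypothetical helper name, and it assumes only that the scaled matrices end in the file names quoted above and sit somewhere under the directory you pass in. It prefers the length-scaled matrix and falls back to the scaled one; raw **gene_counts.tsv** files are never selected.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the pipeline): pick a scaled count
# matrix suitable for limma+voom from an nf-core/rnaseq output directory.
pick_limma_matrix() {
    local dir="$1" pattern match
    # Order matters: length-scaled counts are tried first.
    for pattern in '*gene_counts_length_scaled.tsv' '*gene_counts_scaled.tsv'; do
        # Search recursively; if several files match, an arbitrary one is used.
        match="$(find "$dir" -type f -name "$pattern" | head -n 1)"
        if [ -n "$match" ]; then
            printf '%s\n' "$match"
            return 0
        fi
    done
    echo "no scaled gene count matrix found under $dir" >&2
    return 1
}
```

The printed path could then be passed to `--matrix` in the limma+voom command shown earlier, for example `--matrix "$(pick_limma_matrix rnaseq_results)"`, where `rnaseq_results` is a placeholder for your own output directory.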

2 changes: 2 additions & 0 deletions conf/modules.config
@@ -274,6 +274,7 @@ process {
"--lfc ${params.limma_lfc}",
"--confint ${params.limma_confint}",
"--subset_to_contrast_samples $params.differential_subset_to_contrast_samples",
"--use_voom \"${params.limma_use_voom}\"",
((meta.blocking == null) ? '' : "--blocking_variables $meta.blocking"),
((meta.exclude_samples_col == null) ? '' : "--exclude_samples_col $meta.exclude_samples_col"),
((meta.exclude_samples_values == null) ? '' : "--exclude_samples_values $meta.exclude_samples_values")
@@ -437,6 +438,7 @@ process {
pattern: '*.html'
]
]
memory = { 12.GB * task.attempt }
}

withName: MAKE_REPORT_BUNDLE {
36 changes: 36 additions & 0 deletions conf/rnaseq_limma.config
@@ -0,0 +1,36 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running RNA-seq analysis
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines settings specific to RNA-seq analysis

Use as follows:
nextflow run nf-core/differentialabundance -profile rnaseq_limma,<docker/singularity> --outdir <OUTDIR>

----------------------------------------------------------------------------------------
*/

params {

config_profile_name = 'RNA-seq profile with limma'
config_profile_description = 'Settings for RNA-seq analysis with limma'

// Study
study_type = 'rnaseq'
study_abundance_type = 'counts'

// Observations
observations_id_col = 'sampleId'
observations_name_col = 'sampleId'

// Differential options
differential_use_limma = true
differential_file_suffix = ".limma.results.tsv"
differential_fc_column = "logFC"
differential_pval_column = "P.Value"
differential_qval_column = "adj.P.Val"
differential_feature_id_column = "Geneid"
differential_feature_name_column = "Geneid"
limma_use_voom = true

}
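
The profile above names the result columns (`logFC`, `P.Value`, `adj.P.Val`) and the `.limma.results.tsv` suffix. As a rough illustration of how those settings relate to the output, here is a sketch for summarising one results table from the shell. `count_significant` is a hypothetical helper, the default thresholds (adjusted p-value at most 0.05, absolute log fold change at least 1) are illustrative choices rather than pipeline defaults, and the file is assumed to be tab-separated with an unquoted header row.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count features passing adjusted p-value and |logFC|
# thresholds in a limma results table.
# Usage: count_significant FILE [QVAL_MAX] [ABS_LFC_MIN]
count_significant() {
    local file="$1" qmax="${2:-0.05}" lfcmin="${3:-1}"
    awk -F'\t' -v qmax="$qmax" -v lfcmin="$lfcmin" '
        NR == 1 {
            # Map header names to column positions so column order does not matter.
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("logFC" in col) || !("adj.P.Val" in col)) {
                print "missing logFC or adj.P.Val column" > "/dev/stderr"
                bad = 1
                exit 1
            }
            next
        }
        {
            lfc = $col["logFC"]; q = $col["adj.P.Val"]
            if (lfc == "NA" || q == "NA") next   # skip features without estimates
            lfc += 0; q += 0
            if (lfc < 0) lfc = -lfc
            if (q <= qmax + 0 && lfc >= lfcmin + 0) n++
        }
        END { if (!bad) print n + 0 }
    ' "$file"
}
```

Looking columns up by header name keeps the helper independent of column order, which mirrors how the profile refers to columns by name rather than by position.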
67 changes: 67 additions & 0 deletions conf/test_rnaseq_limma.config
@@ -0,0 +1,67 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running RNA-seq analysis
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines settings specific to RNA-seq analysis

Use as follows:
nextflow run nf-core/differentialabundance -profile test_rnaseq_limma,<docker/singularity> --outdir <OUTDIR>

----------------------------------------------------------------------------------------
*/

includeConfig 'rnaseq_limma.config'


params {
study_name = 'SRP254919'
config_profile_name = 'Test profile'
config_profile_description = 'Minimal test dataset to check pipeline function'

// Input data
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/mus_musculus/rnaseq_expression/SRP254919.samplesheet.csv'
matrix = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/mus_musculus/rnaseq_expression/SRP254919.salmon.merged.gene_counts.top1000cov.tsv'
transcript_length_matrix = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/mus_musculus/rnaseq_expression/SRP254919.spoofed_lengths.tsv'
contrasts = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/mus_musculus/rnaseq_expression/SRP254919.contrasts.csv'

// To do: replace this with a cut-down mouse GTF matching the matrix for testing
gtf = 'https://ftp.ensembl.org/pub/release-81/gtf/mus_musculus/Mus_musculus.GRCm38.81.gtf.gz'

// Observations
observations_id_col = 'sample'
observations_name_col = 'sample'

// Features
features_type = 'gene'
features_id_col = 'gene_id'
features_name_col = 'gene_name'
features_metadata_cols = 'gene_id,gene_name,gene_biotype'

// Differential options
differential_feature_id_column = "gene_id"
differential_feature_name_column = "gene_name"
differential_fc_column = "logFC"

// Apply a higher filter to check that the filtering works
filtering_min_abundance=10

// Exploratory
exploratory_assay_names = "raw,normalised"
exploratory_final_assay = "normalised"
exploratory_log2_assays = 'raw,normalised'
exploratory_main_variable = 'contrasts'

// Test dataset is too small for the nsub default value
deseq2_vst_nsub = 500

// Activate GSEA
gsea_run = true
gene_sets_files = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/mus_musculus/gene_set_analysis/mh.all.v2022.1.Mm.symbols.gmt'

// Report options
report_round_digits = 3
report_contributors = 'Jane Doe\nDirector of Institute of Microbiology\nUniversity of Smallville;John Smith\nPhD student\nInstitute of Microbiology\nUniversity of Smallville'

}


Binary file modified docs/images/workflow.png