Sample exclusion options fail due to contrast-wise normalisation #133

Closed
pinin4fjords opened this issue May 26, 2023 · 0 comments · Fixed by #132
Labels
bug Something isn't working

@pinin4fjords
Member

Description of the bug

Reported at https://nfcore.slack.com/archives/C045UNCS5R9/p1685076556753279

The current strategy assumes that the normalised matrices DESeq2 produces for all contrasts are identical. We simply take the first one and use it in the exploratory analysis and other downstream steps.

This was fine before sample exclusion parameters were introduced. Now each contrast can output a different normalised matrix, containing only the samples retained for that contrast, which is not appropriate for the global comparison against the sample metadata in the exploratory analysis.
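
To make the failure mode concrete, here is a minimal Python sketch (not pipeline code). The sample IDs, contrast names, and the `exclude` structure are all invented for illustration. It shows that once contrasts exclude different samples, the per-contrast normalised matrices no longer share a column set, so the first one cannot stand in for a global matrix.

```python
# Hypothetical illustration only: sample IDs, contrast names, and the
# "exclude" structure are assumptions, not taken from the pipeline.
all_samples = ["s1_ctrl", "s2_ctrl", "s3_treat", "s4_treat", "s5_other"]

# Each contrast may exclude a different subset of samples.
contrasts = {
    "treat_vs_ctrl": {"exclude": {"s5_other"}},
    "other_vs_ctrl": {"exclude": {"s3_treat", "s4_treat"}},
}


def normalised_columns(samples, exclude):
    """Return the sample columns a per-contrast normalised matrix would hold."""
    return [s for s in samples if s not in exclude]


per_contrast = {
    name: normalised_columns(all_samples, spec["exclude"])
    for name, spec in contrasts.items()
}

# The "take the first matrix" strategy described above.
first = next(iter(per_contrast.values()))

# Samples in the global metadata that the first matrix cannot provide.
missing_from_first = sorted(set(all_samples) - set(first))

# True only if every contrast produced the same set of sample columns.
all_same = len({tuple(cols) for cols in per_contrast.values()}) == 1

print("missing from first matrix:", missing_from_first)
print("all contrasts share the same columns:", all_same)
```

With no exclusions every contrast yields the same columns and the shortcut is harmless. With exclusions, the global sample metadata lists samples that the chosen matrix does not contain.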

Command used and terminal output

Errors like:


ERROR ~ Error executing process > 'NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:PLOT_EXPLORATORY (condition)'

Caused by:
  Process `NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:PLOT_EXPLORATORY (condition)` terminated with an error exit status (1)

Command executed:

  exploratory_plots.R \
      --sample_metadata "input_diff_subset.sample_metadata.tsv" \
      --feature_metadata "Homo_sapiens.anno.feature_metadata.tsv" \
      --assay_files "salmon.merged.gene_counts.assay.tsv,condition-RV-RV_il1-participant.normalised_counts.tsv,condition-RV-RV_il1-participant.vst.tsv" \
      --contrast_variable "condition" \
      --outdir "condition" \
      --sample_id_col "sample" --feature_id_col "gene_id" --assay_names "raw,normalised,variance_stabilised" --final_assay "variance_stabilised" --outlier_mad_threshold -5 --palette_name "Set1"
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_DIFFERENTIALABUNDANCE:DIFFERENTIALABUNDANCE:PLOT_EXPLORATORY":
      r-base: $(echo $(R --version 2>&1) | sed 's/^.*R version //; s/ .*$//')
      r-shinyngs: $(Rscript -e "library(shinyngs); cat(as.character(packageVersion('shinyngs')))")
  END_VERSIONS

Command exit status:
  1

Command output:
  [1] "Reading inputs..."

Command error:
      colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
      get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
      match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
      Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
      table, tapply, union, unique, unsplit, which.max, which.min
  
  Loading required package: S4Vectors
  
  Attaching package: 'S4Vectors'
  
  The following objects are masked from 'package:base':
  
      expand.grid, I, unname
  
  Loading required package: IRanges
  Loading required package: GenomeInfoDb
  Loading required package: Biobase
  Welcome to Bioconductor
  
      Vignettes contain introductory material; view with
      'browseVignettes()'. To cite Bioconductor, see
      'citation("Biobase")', and for packages 'citation("pkgname")'.
  
  
  Attaching package: 'Biobase'
  
  The following object is masked from 'package:MatrixGenerics':
  
      rowMedians
  
  The following objects are masked from 'package:matrixStats':
  
      anyMissing, rowMedians
  
  
  Attaching package: 'shinyngs'
  
  The following object is masked from 'package:MatrixGenerics':
  
      colMedians
  
  The following object is masked from 'package:matrixStats':
  
      colMedians
  
  [1] "Reading inputs..."
  Error in read_matrix(x, sample_metadata = sample_metadata, feature_metadata = feature_metadata,  : 
    Some sample metadata names (X112_1,X112_2,X112_4,X112_6,X140_2,X140_4,X140_6,X147_1,X147_2,X147_4,X147_6,X158_1,X158_2,X158_4,X158_6,X49_1,X49_2,X49_4,X49_6,X50_1,X50_2,X50_4,X50_6,X51_1,X51_2,X51_4,X51_6,X52_1,X52_2,X52_4,X52_6,X54_1,X54_2,X54_4,X54_6,X57_1,X57_2,X57_4,X57_6,X58_1,X65_1,X65_2,X65_4,X65_6,X67_1,X67_2,X67_4,X67_6,X68_1,X68_2,X68_4,X68_6,X87_1,X87_2,X87_4,X87_6,X91_1,X91_2,X91_4,X91_6,X98_1,X98_2,X98_4,X98_6) are absent from the matrix in condition-RV-RV_il1-participant.normalised_counts.tsv, columns are: X112_3,X112_5,X140_3,X140_5,X147_3,X147_5,X158_3,X158_5,X49_3,X49_5,X50_3,X50_5,X51_3,X52_3,X52_5,X54_3,X54_5,X57_3,X57_5,X58_3,X58_5,X65_3,X65_5,X67_3,X67_5,X68_3,X68_5,X87_3,X87_5,X91_3,X91_5,X98_3,X98_5
  Calls: lapply -> lapply -> FUN -> read_matrix
  Execution halted

Work dir:
  /home/ubuntu/scratch/raw_data/diff_abund/work/7d/1c4e82b589c0db6810de0bace1abf4

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details


### Relevant files

_No response_

### System information

_No response_
@pinin4fjords pinin4fjords added the bug Something isn't working label May 26, 2023
@pinin4fjords pinin4fjords linked a pull request May 26, 2023 that will close this issue