Error in dimnames(covmat.unscaled) <- list(xnames, xnames) : length of 'dimnames' [1] not equal to array extent #9

charlesgwellem · 2019-08-06T14:42:54Z

Hi. I will be very grateful for any hint on how to overcome the error. I wish to deconvolve my bulk RNA seq data from the lungs of mice using single cell RNA seq data. For practice, I am following your tutorial using my own bulk seq data and the single cell rna seq data you have provided. All seems to go well until when I try to obtain the Cell type-specific differential expression by running the following code. From the beginning I thought the presence of factors(e.g. for th treatment conditions or phenoData) in my data set was the cause but after getting rid of the factors I still get the same error:

csfit <- bseqsc_csdiff(the_best_ilc_bulk_expr_set[genes, ] ~ phenoData| alpha + beta + ductal + acinar,  
                       verbose = 2, nperms = 5000, .rng = 12345)`

Then I get this error:

Groups: bleo=5L | bleo_ko_i=5L | bleo_wt_i=5L | Nacl=5L | NA0L
Cell type(s): 'alpha', 'beta', ..., 'acinar' (4 total)
Fitting mode: auto
Data (filtered): 1202 features x 20 samples
Model has factor effect with more than 2 levels: fitting lm interaction model
Fitting model with nonnegative effects
Model with more than 2 groups: switching to version 2
 Fitting linear interaction model ... Error in dimnames(covmat.unscaled) <- list(xnames, xnames) : 
  length of 'dimnames' [1] not equal to array extent
In addition: Warning messages:
1: In lsfit(D, G, intercept = FALSE) : 'X' matrix was collinear
2: In sqrt(((n - p) * stddevmat^2 - resids^2/(1 - hatdiag[good]))/(n -  :
  NaNs produced

Here are the details of my code:

# inspired from https://github.com/cozygene/bisque/issues/4

# creating expression set from my bulk rna seq data

library(openxlsx)
library(Biobase)
library(bseqsc)
library(tidyverse)
bseqsc_config('CIBERSORT.R')
data(pancreasMarkers)


# read in the raw data

bulk <- read.xlsx("all_samples.xlsx",
                  sheet = "rawCounts_unrepeated_genes")
head(bulk)

bulk$Gene <- NULL
rownames(bulk) <- toupper(bulk$external_gene_name)
bulk$external_gene_name <- NULL
bulk <- as.matrix(bulk)
head(bulk)


# create expression matrix 

featureData <- as.character(rownames(bulk))
featureData <- as(data.frame(featureData, 
                             stringsAsFactors = FALSE), "AnnotatedDataFrame")
rownames(featureData) <- rownames(bulk)
phenoData <- c(rep("bleo", 5), rep("Nacl", 5), rep("bleo_wt_i", 5), rep("bleo_ko_i", 5))
phenoData <- as(as.data.frame(phenoData), "AnnotatedDataFrame")
rownames(phenoData) <- colnames(bulk)
the_best_ilc_bulk_expr_set <- ExpressionSet(assayData = bulk, 
                                            phenoData = phenoData, featureData = featureData)

# read in the single cell rna seq data and fit the model

eislet <- readRDS('islet-eset.rds')
B <- bseqsc_basis(eislet, pancreasMarkers, 
                  clusters = 'cellType', samples = 'sampleID', ct.scale = TRUE)
fit <- bseqsc_proportions(the_best_ilc_bulk_expr_set, B, verbose = TRUE)


pData(the_best_ilc_bulk_expr_set) <- cbind(pData(the_best_ilc_bulk_expr_set), t(coef(fit)))

fit_edger<- fitEdgeR(the_best_ilc_bulk_expr_set, ~phenoData, 
                     coef = c("phenoDatableo_ko_i", "phenoDatableo_wt_i", "phenoDataNacl"))

# extended

fit_edger_ext <- fitEdgeR(the_best_ilc_bulk_expr_set, ~ phenoData + beta + ductal + acinar +gamma,
                          coef = c("phenoDatableo_ko_i", "phenoDatableo_wt_i",
                                   "phenoDataNacl", "beta", "ductal", 
                                   "acinar", "gamma"))


fit_edger_ext$Symbol <- rownames(fit_edger_ext)

# gather P-values from both models

df_fit_edger <- as.data.frame(fit_edger, stringsAsFactors = FALSE)
df_fit_edger$Symbol <- rownames(df_fit_edger)

req_df_fit_edger <- df_fit_edger["PValue"]
colnames(req_df_fit_edger) <- "Base"
head(req_df_fit_edger)
req_df_fit_edger$Symbol <- rownames(req_df_fit_edger)
df_fit_edger_ext <- as.data.frame(fit_edger_ext, stringsAsFactors = FALSE)
df_fit_edger_ext$Symbol <- rownames(df_fit_edger_ext)

head(df_fit_edger_ext)

req_df_fit_edger_ext <- df_fit_edger_ext["PValue"]
colnames(req_df_fit_edger_ext) <- "Adjusted"
req_df_fit_edger_ext$Symbol <- rownames(req_df_fit_edger_ext)


edger_pvals <- req_df_fit_edger_ext %>%
  inner_join(req_df_fit_edger)
head(edger_pvals)#

rownames(edger_pvals) <- edger_pvals$Symbol



edger_pvals <- mutate(edger_pvals, Regulated = Adjusted <= 0.01 & Adjusted <= Base)

# plot Base vs Adjusted
pvalueScatter(Base ~ Adjusted, edger_pvals, pval.th = 0.01, label.th = 3.5)

# ER genes
genes_ER <- c('HSPA5', 'MAFA', 'HERPUD1', 'DDIT3', 'UCN3', 'NEUROD1')
# Fit on ER stress genes and genes regulated beyond cell type proportion differences
genes <- union(genes_ER, subset(edger_pvals, Regulated)$Symbol)

csfit <- bseqsc_csdiff(the_best_ilc_bulk_expr_set[genes, ] ~ phenoData| alpha + beta + ductal + acinar,  
                       verbose = 2, nperms = 5000, .rng = 12345)

Then I have the error:

Groups: bleo=5L | bleo_ko_i=5L | bleo_wt_i=5L | Nacl=5L | NA0L
Cell type(s): 'alpha', 'beta', ..., 'acinar' (4 total)
Fitting mode: auto
Data (filtered): 1202 features x 20 samples
Model has factor effect with more than 2 levels: fitting lm interaction model
Fitting model with nonnegative effects
Model with more than 2 groups: switching to version 2
 Fitting linear interaction model ... Error in dimnames(covmat.unscaled) <- list(xnames, xnames) : 
  length of 'dimnames' [1] not equal to array extent
In addition: Warning messages:
1: In lsfit(D, G, intercept = FALSE) : 'X' matrix was collinear
2: In sqrt(((n - p) * stddevmat^2 - resids^2/(1 - hatdiag[good]))/(n -  :
  NaNs produced


> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=de_DE.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] edgeR_3.26.6          limma_3.40.6          preprocessCore_1.46.0 e1071_1.7-2           forcats_0.4.0         stringr_1.4.0        
 [7] dplyr_0.8.3           purrr_0.3.2           readr_1.3.1           tidyr_0.8.3           tibble_2.1.3          ggplot2_3.2.0        
[13] tidyverse_1.2.1       bseqsc_1.0            csSAM_1.4             Rcpp_1.0.2            Biobase_2.44.0        BiocGenerics_0.30.0  
[19] openxlsx_4.1.0.1     

loaded via a namespace (and not attached):
 [1] nlme_3.1-140         lubridate_1.7.4      bit64_0.9-7          doParallel_1.0.14    RColorBrewer_1.1-2   httr_1.4.0          
 [7] tools_3.6.0          backports_1.1.4      R6_2.4.0             DBI_1.0.0            lazyeval_0.2.2       colorspace_1.4-1    
[13] withr_2.1.2          tidyselect_0.2.5     gridExtra_2.3        xbioc_0.1.17         bit_1.1-14           compiler_3.6.0      
[19] cli_1.1.0            rvest_0.3.4          xml2_1.2.1           pkgmaker_0.28        scales_1.0.0         NMF_0.22            
[25] digest_0.6.20        pkgconfig_2.0.2      bibtex_0.4.2         rlang_0.4.0          readxl_1.3.1         rstudioapi_0.10     
[31] RSQLite_2.1.2        generics_0.0.2       jsonlite_1.6         dendextend_1.12.0    zip_2.0.3            magrittr_1.5        
[37] Formula_1.2-3        munsell_0.5.0        S4Vectors_0.22.0     viridis_0.5.1        stringi_1.4.3        yaml_2.2.0          
[43] plyr_1.8.4           grid_3.6.0           blob_1.2.0           crayon_1.3.4         lattice_0.20-38      haven_2.1.1         
[49] splines_3.6.0        hms_0.5.0            locfit_1.5-9.1       zeallot_0.1.0        pillar_1.4.2         rngtools_1.4        
[55] reshape2_1.4.3       codetools_0.2-16     stats4_3.6.0         glue_1.3.1           BiocManager_1.30.4   modelr_0.1.4        
[61] vctrs_0.2.0          foreach_1.4.7        cellranger_1.1.0     gtable_0.3.0         assertthat_0.2.1     gridBase_0.4-7      
[67] xtable_1.8-4         broom_0.5.2          class_7.3-15         viridisLite_0.3.0    iterators_1.0.12     AnnotationDbi_1.46.0
[73] registry_0.5-1       memoise_1.1.0        IRanges_2.18.1       cluster_2.1.0

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error in dimnames(covmat.unscaled) <- list(xnames, xnames) : length of 'dimnames' [1] not equal to array extent #9

Error in dimnames(covmat.unscaled) <- list(xnames, xnames) : length of 'dimnames' [1] not equal to array extent #9

charlesgwellem commented Aug 6, 2019 •

edited

Loading

Error in dimnames(covmat.unscaled) <- list(xnames, xnames) : length of 'dimnames' [1] not equal to array extent #9

Error in dimnames(covmat.unscaled) <- list(xnames, xnames) : length of 'dimnames' [1] not equal to array extent #9

Comments

charlesgwellem commented Aug 6, 2019 • edited Loading

charlesgwellem commented Aug 6, 2019 •

edited

Loading