Error:infer_grn(), #42

Open
wxpbioinfo opened this issue Jun 24, 2023 · 5 comments

Comments

@wxpbioinfo

Hi, when I ran this function I got the error below. I checked my motif matrix and gene names, but none of them begin with a number. I am quite confused; could you help me figure out what is going wrong?
[image: screenshot of the error]
This is my code:

scARC=readRDS("./Data/scARC_celltype.rds")
DefaultAssay(scARC) <- "peaks"
seqlevelsStyle(BSgenome.Mmulatta.UCSC.rheMac10) <- 'Ensembl'
scARC <- initiate_grn(scARC, rna_assay = 'RNA',peak_assay = 'peaks')
pwm_set <- getMatrixSet(x = JASPAR2022, opts = list(species = 9606, all_versions = FALSE))

plan("multisession", workers = 20)
# Find TF binding sites
scARC <- find_motifs(scARC,pfm = pwm_set,genome = BSgenome.Mmulatta.UCSC.rheMac10)
# Infer the GRN
genes <- scARC@assays[["RNA"]]@var.features
filtered_text <- grep("1_.", x, value=TRUE)
genes <- genes[!grepl("^ID3.", genes)]
scARC <- infer_grn(scARC,genes=genes,peak_to_gene_method = 'Signac',method = 'glm')
plan("sequential")
coef(scARC)

@elhaam

elhaam commented Apr 18, 2024

Hello @joschif
Thank you for the detailed tutorials! I have an issue similar to the one reported above. I followed the tutorials, and at this step I got an error when trying to infer the GRN for highly variable genes. Removing the genes argument below did not help.

Package versions:

print(packageVersion("Seurat"))
[1] '5.0.3'
print(packageVersion("SeuratObject"))
[1] '5.0.1'
print(packageVersion("Pando"))
[1] '1.1.1'

library(doParallel)
registerDoParallel(4)
muo_data <- infer_grn(
  muo_data,
  peak_to_gene_method = 'GREAT',
  genes=top_variable_genes,
  verbose=2,
  tf_cor=0,
  #genes = patterning_genes$symbol
  parallel = T
)

Here is my error:

Selecting candidate regulatory regions near genes
Preparing model input
Fitting models for 1525 target genes
Error in { :
task 3 failed - "x and y should have the same number of rows"

I have tried many possible ways to solve this but I have not succeeded. Would you please help?

> muo_data
An object of class "GRNData"
Slot "grn":
A RegulatoryNetwork object based on 1136 transcription factors


No network has been inferred

Slot "data":
An object of class Seurat 
128093 features across 1136 samples within 2 assays 
Active assay: peaks (91492 features, 0 variable features)
 2 layers present: counts, data
 1 other assay present: RNA

I have my RNA and ATAC data as follows:

> coembed <- merge(x = pbmc_atac_filtered, y = rna_seurat)
> print(coembed)
An object of class Seurat 
128093 features across 1136 samples within 2 assays 
Active assay: peaks (91492 features, 0 variable features)
 2 layers present: counts, data
 1 other assay present: RNA
> coembed[['RNA']]
Assay (v5) data with 36601 features for 579 cells
Top 10 variable features:
 CXCL8, HIST1H2AC, AFF3, NRG1, PDE4D, IL1B, EREG, AL163541.1, ADGRB3, NEGR1 
Layers:
 counts, data 
> coembed[['peaks']]
ChromatinAssay data with 91492 features for 557 cells
Variable features: 0 
Genome: 
Annotation present: TRUE 
Motifs present: FALSE 
Fragment files: 0 

> muo_data <- initiate_grn(
  coembed,
  rna_assay = 'RNA',
  peak_assay = 'peaks',
  regions = phastConsElements20Mammals.UCSC.hg38 
)

I see that I have 579 cells in RNA but 557 in ATAC. I troubleshot this and posted an update in another comment below.

Thank you very much.
Elham
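
For readers hitting the same "x and y should have the same number of rows" message: the comment above notes that the RNA and ATAC assays hold different numbers of cells. A minimal sketch of restricting both to their shared barcodes is shown below. It uses small toy matrices (all names here are hypothetical) rather than the actual Seurat object, and it is only an illustration of the idea, not the fix confirmed in this thread.

```r
# Toy matrices standing in for the two assays (features x cells).
# All names here are hypothetical and only illustrate the idea.
rna <- matrix(
  0, nrow = 3, ncol = 4,
  dimnames = list(paste0("gene", 1:3), c("c1", "c2", "c3", "c4"))
)
atac <- matrix(
  0, nrow = 2, ncol = 3,
  dimnames = list(c("peak1", "peak2"), c("c2", "c3", "c5"))
)

# Cell barcodes present in both modalities.
common_cells <- intersect(colnames(rna), colnames(atac))

# Keep only the shared cells, in the same order, in both matrices.
rna_common  <- rna[, common_cells, drop = FALSE]
atac_common <- atac[, common_cells, drop = FALSE]

# Both matrices now describe the same cells.
stopifnot(identical(colnames(rna_common), colnames(atac_common)))
```

With a Seurat object, the analogous step would presumably be `subset(obj, cells = common_cells)` before calling `initiate_grn()`, assuming the barcodes actually match across assays.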

@elhaam

elhaam commented Apr 18, 2024

Hello @joschif

I am updating this issue. I kept only the cells common to both assays, so my RNA and ATAC data now both have 557 cells. The error has changed to the following:

> registerDoParallel(4)
> muo_data <- infer_grn(
+   muo_data,
+   peak_to_gene_method = 'Signac', #GREAT',
+   genes=top_variable_genes,
+   verbose=2,
+   tf_cor=0,
+   #genes = patterning_genes$symbol
+   parallel = T
+ )

Loaded glmnet 4.1-8
Selecting candidate regulatory regions near genes
Preparing model input
Fitting models for 1525 target genes
Error in { :
task 3 failed - ""CRsparse_colSums" not resolved from current namespace (Matrix)"

Would you please let me know if you have any suggestions?
Thank you so much in advance.

@joschif
Collaborator

joschif commented Apr 18, 2024

Hi @elhaam, unfortunately it's very hard to tell what the exact problem is here. However, it seems to stem not from the Pando code itself but from the Matrix package. Maybe you can try updating it or installing a different version.

@elhaam

elhaam commented Apr 18, 2024

Thanks @joschif! Yes, you are right, the Matrix package was the problem. Following this solution and this one worked for me, in case anyone faces this issue in the future. Also, make sure you have the correct version of Bioconductor, based on this issue on Seurat.

@damouzo

damouzo commented Sep 4, 2024

Same issue as @wxpbioinfo. I tried with a few genes and still get the error with genes other than 'RSPO4', but always with the same '20_'. The only difference from the tutorial is that my peak names use the NCBI style. Is there any way to avoid redoing the preprocessing in UCSC style (since the style cannot be changed in the Seurat object)? Very nice package, by the way!

> grn_object <- infer_grn(grn_object, peak_to_gene_method = 'Signac', method = 'glm', verbose = T) 
Selecting candidate regulatory regions near genes 
Preparing model input 
Fitting models for 1278 target genes  
|+++                                               | 4 % ~01m 10s      Error en str2lang(x): <text>:1:11: unexpected input
1: RSPO4 ~ 20_
              ^
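
The parse error above suggests that a term starting with a digit (here a chromosome named `20`, as in NCBI/Ensembl style) ends up unquoted in a model formula. The base-R sketch below reproduces that behaviour in isolation. The peak names are hypothetical, and this only illustrates why such names fail to parse; it does not show how Pando builds its formulas internally.

```r
# Hypothetical peak names: one NCBI/Ensembl-style, one UCSC-style.
peaks <- c("20_1000_2000", "chr20_1000_2000")

# Names that differ from their make.names() version are not
# syntactically valid R names (e.g. they start with a digit).
non_syntactic <- peaks[peaks != make.names(peaks)]

# Parsing an unquoted formula that contains such a name fails.
parse_error <- tryCatch(
  {
    str2lang(paste("RSPO4 ~", peaks[1]))
    NA_character_
  },
  error = function(e) conditionMessage(e)
)

# Backtick-quoting the same name lets the formula parse.
quoted <- as.formula(paste0("RSPO4 ~ `", peaks[1], "`"))
all.vars(quoted)
```

Checking `non_syntactic` on the real peak names is a quick way to see whether the chromosome naming style is the likely cause, which would also fit the `seqlevelsStyle(...) <- 'Ensembl'` call in the first post.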
