all_versions = FALSE in JASPAR omits results #33

Open
nellykan opened this issue Jan 10, 2022 · 3 comments

@nellykan

Hi!

I am trying this tool for the first time and I encountered the following issue.

When I access the mouse matrices (Tax ID: 10090) with the option "all_versions=TRUE", I get results for various factors, including, for example, Klf4 and Sox2.

> opts <- list()
> opts[["species"]] <- 10090
> opts[["collection"]] <- "CORE"
> opts[["all_versions"]] <- TRUE
> opts[["matrixtype"]] <- "PFM"
> PFMatrixList <- getMatrixSet(JASPAR2020, opts)
> PFMatrixList
PFMatrixList of length 196
names(196): MA0004.1 MA0006.1 MA0009.1 MA0014.1 MA0027.1 MA0029.1 ... MA1627.1 MA1628.1 MA1629.1 MA1630.1 MA0122.3 MA1684.1
> 
> TFs <- unlist((lapply(PFMatrixList@listData, slot, name = "name")))
> TFs[TFs == "Sox2"]
MA0143.1 MA0143.2 MA0143.3 
  "Sox2"   "Sox2"   "Sox2" 
> TFs[TFs == "Klf4"]
MA0039.1 MA0039.2 
  "Klf4"   "Klf4" 

However, with the default option "all_versions=FALSE", I do not get any results for these (and other) factors, even though the expected behavior would be to get only the latest version.

> opts <- list()
> opts[["species"]] <- 10090
> opts[["collection"]] <- "CORE"
> opts[["all_versions"]] <- FALSE
> opts[["matrixtype"]] <- "PFM"
> PFMatrixList <- getMatrixSet(JASPAR2020, opts)
> PFMatrixList
PFMatrixList of length 107
names(107): MA0004.1 MA0006.1 MA0029.1 MA0067.1 MA0078.1 MA0087.1 ... MA1627.1 MA1628.1 MA1629.1 MA1630.1 MA0122.3 MA1684.1
> 
> TFs <- unlist((lapply(PFMatrixList@listData, slot, name = "name")))
> TFs[TFs == "Sox2"]
named character(0)
> TFs[TFs == "Klf4"]
named character(0)

I would appreciate any help or tips. Thanks a lot!

Nelly

Session Info:
R version 4.1.1 (2021-08-10)
TFBSTools_1.32.0
JASPAR2020_0.99.10

@ge11232002
Owner

Hi Nelly,

The species of certain motifs changed between versions, as did the name. Please check the version information for Sox2:
https://jaspar.genereg.net/matrix/MA0143.3/
I would omit the species option to fetch all.

library(JASPAR2020)
opts <- list()
opts[["species"]] <- NULL
opts[["collection"]] <- "CORE"
opts[["all_versions"]] <- FALSE
opts[["matrixtype"]] <- "PFM"
PFMatrixList <- getMatrixSet(JASPAR2020, opts)
PFMatrixList
which(name(PFMatrixList) == "Klf4")
which(name(PFMatrixList) == "KLF4")
which(name(PFMatrixList) == "Sox2")
which(name(PFMatrixList) == "SOX2")

@nellykan
Author

Hi Ge Tan,

I am not sure how that solves the problem. I do not want the motif for any species, but for Mus musculus specifically, and only the latest Mus musculus version. In other words, I think the version filtering should happen after the species is selected.

@ge11232002
Owner

If you only want the Mus musculus version, then you will have to pick it from the "all_versions" results as you did initially, because the latest version can be from human.
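
Picking from the "all_versions" result could look roughly like the sketch below. It assumes the matrix IDs follow the "MAxxxx.version" pattern shown in the output above, and `latest_version_ids` is a hypothetical helper written for this illustration, not a TFBSTools function. It keeps, for each base ID, the entry with the highest version number among the IDs it is given.

```r
# Hypothetical helper (not part of TFBSTools): given JASPAR matrix IDs such as
# "MA0143.3", keep only the highest version present for each base ID.
latest_version_ids <- function(ids) {
  base <- sub("\\.[0-9]+$", "", ids)             # "MA0143.3" -> "MA0143"
  version <- as.integer(sub("^.*\\.", "", ids))  # "MA0143.3" -> 3
  # TRUE where an entry carries the maximum version seen for its base ID
  is_latest <- version == ave(version, base, FUN = max)
  ids[is_latest]
}

# Example with IDs taken from the mouse, all_versions = TRUE output above
ids <- c("MA0143.1", "MA0143.2", "MA0143.3", "MA0039.1", "MA0039.2", "MA0004.1")
latest_version_ids(ids)

# Assumed usage on the mouse-only PFMatrixList from the original post, relying
# on the list names being the matrix IDs and on subsetting by name:
# latest_mouse <- PFMatrixList[latest_version_ids(names(PFMatrixList))]
```

Note that this returns the newest version among the mouse-annotated matrices only, which may be older than the newest version in JASPAR overall.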
