genomeName error in getCTSS #121

Open
ckuanglim85 opened this issue Aug 13, 2024 · 1 comment

ckuanglim85 commented Aug 13, 2024

Hello,
I have some CAGE sequencing data from oil palm. The reads were mapped to the genome, and I have the alignment BAM files.
I loaded my BAM files with CAGEexp, but I run into a problem when running getCTSS.

> inputFiles <- c("/home/chankl/abiotic_stress/CAGEr/PK7009.sorted.bam", "/home/chankl/abiotic_stress/CAGEr/PK7011.sorted.bam")
> myCAGEexp <- CAGEr:::CAGEexp(inputFiles=inputFiles, inputFilesType="bam", sampleLabels=c("PK7009","PK7011"))
> myCAGEctss <- CAGEr:::getCTSS(myCAGEexp, useMulticore=TRUE, nrCores=32, correctSystematicG=FALSE)

Reading in file: /home/chankl/abiotic_stress/CAGEr/PK7009.sorted.bam...
-> Filtering out low quality reads...
-> Removing the first base of the reads if 'G' and not aligned to the genome...

Stop worker failed with the error: wrong args for environment subassignment
Error: BiocParallel errors
1 remote errors, element index: 1
1 unevaluated and other errors
first remote error:
Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'getSeq': Can not run this function with a NULL genome; see 'help("genomeName")'.

This looks like a problem with genomeName. How can I load the genome of this custom, non-model species and use it with CAGEr?
Thanks in advance.

@charles-plessy
Owner

Hi, you also need to set removeFirstG=FALSE if you do not have a BSgenome package for your species. We are working on supporting FASTA file input instead, but I do not know when it will be ready.

Quick comments on your code:

  • You do not need to use the private accessor ::: to call the functions. If you really want to prefix the package name rather than calling library(CAGEr), use :: instead.
  • If you have only two samples, you only need two cores for the parallelisation. Note that parallelising over a large number of samples risks exhausting the available memory.
  • The idiomatic way of running the core CAGEr functions is to overwrite the input object, as these functions produce an output that is a copy of the input plus some additions. Something like ce <- getCTSS(ce, removeFirstG=FALSE).
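
Putting the points above together, a call along these lines may work. This is only a sketch: it reuses the file paths, sample labels and correctSystematicG=FALSE from the original post, assumes that no BSgenome package is available for the species, and has not been run on the actual data. The object name ce is just the one used in the reply.

```r
library(CAGEr)

# BAM files and sample labels taken from the original post
inputFiles <- c("/home/chankl/abiotic_stress/CAGEr/PK7009.sorted.bam",
                "/home/chankl/abiotic_stress/CAGEr/PK7011.sorted.bam")

# Build the CAGEexp object without a genome, as in the original call
ce <- CAGEexp(inputFiles     = inputFiles,
              inputFilesType = "bam",
              sampleLabels   = c("PK7009", "PK7011"))

# removeFirstG = FALSE avoids the getSeq() lookup that needs a BSgenome;
# two cores are enough for two samples; the result overwrites the input object
ce <- getCTSS(ce,
              removeFirstG       = FALSE,
              correctSystematicG = FALSE,
              useMulticore       = TRUE,
              nrCores            = 2)
```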

Lastly, in GitHub Markdown, you can use triple backquotes ``` to quote multiple lines of code:

like this
