Give different names to plus and minus strand when exporting to BedGraph #56

Open
Hami7407 opened this issue Jan 16, 2022 · 6 comments
Labels
bug regression This worked in a previous release.

Comments

@Hami7407

Hello, I am trying to export to a BedGraph file.

However, I only get a tag cluster file that is not divided into two strands.

This is what I did:

> trk <- exportToTrack(CTSSnormalizedTpmGR(humanS_cluster, "Adult_brain_1S"))
> humanS_cluster |> CTSSnormalizedTpmGR("all") |> exportToTrack(humanS_cluster, oneTrack = FALSE)
GRangesList object of length 2:
[[1]]
UCSC track 'Adult_brain_1S (TC)'
UCSCData object with 1310873 ranges and 5 metadata columns:
            seqnames    ranges strand |    genes annotation filteredCTSSidx     score     itemRgb
               <Rle> <IRanges>  <Rle> |    <Rle>      <Rle>           <Rle> <numeric> <character>
        [1]     chr1    564451      + | MTND1P23   promoter            TRUE   3.05356       black
        [2]     chr1    564455      + | MTND1P23   promoter            TRUE   1.56056       black
        [3]     chr1    564456      + | MTND1P23   promoter            TRUE   3.05356       black
        [4]     chr1    564463      + | MTND1P23   promoter            TRUE   5.97491       black
        [5]     chr1    564560      + | MTND1P23   promoter            TRUE   1.56056       black
        ...      ...       ...    ... .      ...        ...             ...       ...         ...
  [1310869]     chr1 249211204      - |             unknown            TRUE   1.56056       black
  [1310870]     chr1 249212177      - |             unknown            TRUE   1.56056       black
  [1310871]     chr1 249220686      - |             unknown            TRUE   1.56056       black
  [1310872]     chr1 249221821      - |             unknown            TRUE   1.56056       black
  [1310873]     chr1 249239784      - |             unknown            TRUE   1.56056       black
  -------
  seqinfo: 298 sequences (2 circular) from hg19 genome
[[2]]
UCSC track 'Adult_brain_1S (TC)'
UCSCData object with 502786 ranges and 5 metadata columns:
           seqnames    ranges strand |    genes annotation filteredCTSSidx     score     itemRgb
              <Rle> <IRanges>  <Rle> |    <Rle>      <Rle>           <Rle> <numeric> <character>
       [1]     chr1     82726      + |             unknown            TRUE   3.68027       black
       [2]     chr1    535277      + |             unknown            TRUE   3.68027       black
       [3]     chr1    540765      + |             unknown            TRUE   3.68027       black
       [4]     chr1    564575      + | MTND1P23   promoter            TRUE   3.68027       black
       [5]     chr1    564587      + | MTND1P23   promoter            TRUE  11.45750       black
       ...      ...       ...    ... .      ...        ...             ...       ...         ...
  [502782]     chr1 249200594      - |             unknown            TRUE   3.68027       black
  [502783]     chr1 249200611      - |             unknown            TRUE   3.68027       black
  [502784]     chr1 249200695      - |             unknown            TRUE   3.68027       black
  [502785]     chr1 249200790      - |             unknown            TRUE   3.68027       black
  [502786]     chr1 249201330      - |             unknown            TRUE   3.68027       black
  -------
  seqinfo: 298 sequences (2 circular) from hg19 genome

> trk <- split(trk, strand(trk), drop = TRUE)
> rtracklayer::export.bedGraph(trk, "Adult_brain_1S")
BiocFileList of length 2

The same thing happens when I use rtracklayer::export.bedGraph(trk, "Adult_brain_1S.bedGraph")

How can I get a bedGraph file with the two strands separated?

Thank you

@charles-plessy
Owner

Dear Hami,

in my hands, with the example data from the vignette, the command works properly and produces one file containing two tracks. However, the tracks have the same name; is that the cause of your problem?

Have a nice day,

--
Charles

@Hami7407
Author

Hmm... I can only download a CAGE tag cluster file that is not divided into two strands, even though I split the data in two before exporting. When I open the file, it contains only positive numbers.

I thought I would not have to specify which strand to download when using the export.bedGraph command.
Could you help me with this? It all worked with the previous version of CAGEr.

Thank you,

Best

@charles-plessy
Owner

Can you try something like:

trkBG <- split(trk, strand(trk), drop = TRUE)
trkBG[['+']]@trackLine@name <- paste0(trkBG[['+']]@trackLine@name, " +")
trkBG[['-']]@trackLine@name <- paste0(trkBG[['-']]@trackLine@name, " -")
trkBG[['+']]@trackLine@description <- paste0(trkBG[['+']]@trackLine@description, ", + strand")
trkBG[['-']]@trackLine@description <- paste0(trkBG[['-']]@trackLine@description, ", - strand")
rtracklayer::export.bedGraph(trkBG, "myBedGraphTrack.bedGraph")

If it works I will correct the documentation and try to add a function for such strand splitting.
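
A function for this kind of strand splitting could look roughly like the sketch below. It only wraps the commands above in a loop. The name splitTrackByStrand is hypothetical and not part of CAGEr or rtracklayer, and the sketch assumes, as the commands above do, that split() on a UCSCData object returns elements that keep their trackLine slot.

```r
library(GenomicRanges)  # strand(), split()
library(rtracklayer)    # UCSCData class, export.bedGraph()

# Hypothetical helper (not a CAGEr or rtracklayer function): split a UCSCData
# track by strand and append the strand to each track name and description.
splitTrackByStrand <- function(trk) {
  trkBG <- split(trk, strand(trk), drop = TRUE)
  for (s in names(trkBG)) {          # "+" and/or "-" after drop = TRUE
    x <- trkBG[[s]]
    x@trackLine@name        <- paste0(x@trackLine@name, " ", s)
    x@trackLine@description <- paste0(x@trackLine@description, ", ", s, " strand")
    trkBG[[s]] <- x
  }
  trkBG
}
```

With such a helper, the export would reduce to something like rtracklayer::export.bedGraph(splitTrackByStrand(trk), "myBedGraphTrack.bedGraph").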

@Hami7407
Author

Thank you for the suggestion!
However, it only gives me the + strand... somehow I can't download the minus strand.

Sorry for the issues!

@charles-plessy
Owner

Can you double-check that the minus-strand information is present in the data you send to the UCSC browser? On my computer, with CAGEr's example data I have:

$ grep track myBedGraphTrack.bedGraph 
track name="Zf.30p.dome (TC)+" description="Zf.30p.dome (CAGE Tag Clusters (TC))" visibility=full type=bedGraph
track name="Zf.30p.dome (TC)-" description="Zf.30p.dome (CAGE Tag Clusters (TC))" visibility=full type=bedGraph
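
The same check can also be sketched in base R. The function below is only an illustration (the name countLinesPerTrack is hypothetical): it counts the data lines that follow each track header, so a missing or empty minus-strand track is easy to spot.

```r
# Hypothetical helper: count the data lines under each "track" header
# of a bedGraph file that may contain several tracks.
countLinesPerTrack <- function(path) {
  lines    <- readLines(path)
  isHeader <- grepl("^track", lines)
  trackId  <- cumsum(isHeader)                 # index of the enclosing track
  keep     <- !isHeader & trackId > 0 & nzchar(lines)
  counts   <- tabulate(trackId[keep], nbins = sum(isHeader))
  setNames(counts, lines[isHeader])            # one count per header line
}
```

Calling countLinesPerTrack("myBedGraphTrack.bedGraph") should then return one count per track line, named after that line.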

Also, if I remember correctly, the previous version of CAGEr exported the plus and minus strands to separate files. You can still do that with something like:

rtracklayer::export.bedGraph(trkBG[['+']], "myBedGraphTrackPlus.bedGraph")
rtracklayer::export.bedGraph(trkBG[['-']], "myBedGraphTrackMinus.bedGraph")

@Hami7407
Author

trkBG <- split(trk, strand(trk), drop = TRUE)
trkBG[['+']]@trackLine@name <- paste0(trkBG[['+']]@trackLine@name, " +")
trkBG[['-']]@trackLine@name <- paste0(trkBG[['-']]@trackLine@name, " -")
trkBG[['+']]@trackLine@description <- paste0(trkBG[['+']]@trackLine@description, ", + strand")
trkBG[['-']]@trackLine@description <- paste0(trkBG[['-']]@trackLine@description, ", - strand")
rtracklayer::export.bedGraph(trkBG, "myBedGraphTrack.bedGraph")

Sorry, I checked again with the UCSC browser and the minus strand is there! So the last code you gave me works!

Thank you

@charles-plessy added the bug and regression labels on Jan 18, 2022
@charles-plessy changed the title from "Export to the BedGraph file" to "Give different names to plus and minus strand when exporting to BedGraph" on Jan 18, 2022