Skip to content

Commit

Permalink
New TSSlogo function
Browse files Browse the repository at this point in the history
Closes #90

This function introduces an optional dependency on ggseqlogo.
  • Loading branch information
charles-plessy committed Jul 18, 2023
1 parent 5d8dc67 commit 37bdc2e
Show file tree
Hide file tree
Showing 11 changed files with 167 additions and 0 deletions.
2 changes: 2 additions & 0 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@ Suggests:
BSgenome.Drerio.UCSC.danRer7,
DESeq2,
FANTOM3and4CAGE,
ggseqlogo,
BiocStyle,
knitr,
rmarkdown
Expand Down Expand Up @@ -84,6 +85,7 @@ Collate:
'ShiftingFunctions.R'
'ShiftingMethods.R'
'StrandInvaders.R'
'TSSlogo.R'
LazyData: true
VignetteBuilder: knitr
RoxygenNote: 7.2.1
Expand Down
2 changes: 2 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ export(CTSStoGenes)
export(CustomConsensusClusters)
export(GeneExpDESeq2)
export(GeneExpSE)
export(TSSlogo)
export(aggregateTagClusters)
export(annotateCTSS)
export(annotateConsensusClusters)
Expand Down Expand Up @@ -89,6 +90,7 @@ importFrom(BiocParallel,MulticoreParam)
importFrom(BiocParallel,SerialParam)
importFrom(BiocParallel,bplapply)
importFrom(BiocParallel,multicoreWorkers)
importFrom(Biostrings,consensusMatrix)
importFrom(CAGEfightR,quickEnhancers)
importFrom(GenomeInfoDb,"genome<-")
importFrom(GenomeInfoDb,"seqlevels<-")
Expand Down
87 changes: 87 additions & 0 deletions R/TSSlogo.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
#' TSS logo
#'
#' Plot the sequence logo of the region flanking the TSS. When this function
#' is given _tag clusters_ or _consensus clusters_, it uses the _dominant peak_
#' as the transcription start site.
#'
#' This function will only work if the [`CAGEexp`] object was built with a
#' [`BSgenome`] package, as it needs to extract genomic sequences.
#'
#' @param x A [`CTSS`], a [`TagClusters`] or a [`ConsensusClusters`] object.
#' @param upstream Number of bases to plot upstream the TSS.
#' @param downstream Number of bases to plot downstream the TSS, including the
#' TSS itself.
#'
#' @returns A [`ggplot2::ggplot`] object showing the sequence logo. The
#' coordinates displayed are negative for upstream sequences and positive
#' downstream. The position of the TSS is set to 1.
#'
#' @family CAGEr plot functions
#' @family CAGEr TSS functions
#'
#' @author Charles Plessy
#'
#' @examples
#' TSSlogo(exampleCAGEexp|>consensusClustersGR(), 20, 10)
#'
#' @importFrom Biostrings consensusMatrix
#'
#' @export

setGeneric("TSSlogo",
function (x, upstream = 10, downstream = 10)
standardGeneric("TSSlogo"))

#' @rdname TSSlogo

setMethod("TSSlogo", signature("CAGEexp"),
function (x, upstream = 10, downstream = 10) {
TSSlogo(CTSScoordinatesGR(object), upstream = upstream, downstream = downstream)
})

#' @rdname TSSlogo

setMethod("TSSlogo", signature("TagClusters"),
function (x, upstream = 10, downstream = 10) {
TSSlogo(object$dominant_ctss |> as("CTSS"), upstream = upstream, downstream = downstream)
})

setMethod("TSSlogo", signature("ConsensusClusters"),
function (x, upstream = 10, downstream = 10) {
TSSlogo(object$dominant_ctss |> as("CTSS"), upstream = upstream, downstream = downstream)
})

#' @rdname TSSlogo

setMethod("TSSlogo", signature("CTSS"),
function (x, upstream = 10, downstream = 10) {
.TSSlogo(object, upstream = upstream, downstream = downstream)
})

.TSSlogo <- function(x, upstream=10, downstream=10) {
if (! requireNamespace("ggseqlogo"))
stop("This function requires the ", dQuote("ggseqlogo"), " package; please install it.")
# Extract sequences
upstreamRanges <-
promoters(x = x, upstream = upstream, downstream = downstream) |>
suppressWarnings() # This warns about off-genome coordinates, but we will fix this below.

# Discard ranges that would be trimmed
upstreamRanges_trimmed <- upstreamRanges |> trim()
upstreamRanges <- upstreamRanges[width(upstreamRanges) == width(upstreamRanges_trimmed)]

# Extract sequences
bsgenome <- getBSgenome(unique(genome(x)))
upstreamSeq <- getSeq(bsgenome, upstreamRanges)

# Plot sequence logo
letter_counts <- consensusMatrix(upstreamSeq)
probs <- prop.table(letter_counts[1:4,], 2)
gg <- ggseqlogo::ggseqlogo(probs)
# Circumvent "Scale for x is already present." warning.
gg$scales$scales[[2]]$breaks <- seq( 1 , upstream + downstream, by = 1)
gg$scales$scales[[2]]$labels <- seq(1 - upstream, downstream, by = 1)
gg$scales$scales[[1]]$breaks <- seq( 1 , upstream + downstream, by = 1)
gg$scales$scales[[1]]$labels <- seq(1 - upstream, downstream, by = 1)
gg
}
57 changes: 57 additions & 0 deletions man/TSSlogo.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions man/hanabiPlot.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions man/plotAnnot.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions man/plotCorrelation.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions man/plotExpressionProfiles.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions man/plotInterquantileWidth.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions man/plotReverseCumulatives.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

13 changes: 13 additions & 0 deletions vignettes/CAGEexp.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -340,6 +340,19 @@ corr.m <- plotCorrelation2( ce, samples = "all"
, method = "pearson")
```

### Sequence distribution at the transcription start site.

The presence of the core promoter motifs can be checked with the `TSSlogo()`
function, provided that the _CAGEexp_ object was built with a _BSgenome_
package allowing access to the sequence flanking the transcription start sites.

```{r TSSlogo, fig.cap="Sequence logo at the transcription start site"}
TSSlogo(CTSScoordinatesGR(ce) |> subset(annotation == "promoter"), upstream = 35)
```

The `TSSlogo()` function can also be used later. When passed _tag clusters_
or _consensus clusters_, it will automatically center the regions on their
_dominant peak_.

Merging of replicates
---------------------
Expand Down

0 comments on commit 37bdc2e

Please sign in to comment.