From ab4e620356806c3631d39bc0d3f19b1c4c5c115e Mon Sep 17 00:00:00 2001
From: Yaomin Xu
Date: Fri, 3 Jan 2014 17:54:24 -0600
Subject: [PATCH] Release 1.0 - updated according to the feedback from CRAN

---
 DESCRIPTION               | 13 +++++++----
 R/sb-expression-data.r    |  6 ++----
 R/sb-mutation-data.r      |  3 +--
 README.md                 | 15 ++++++++++---
 man/DESnowball-package.Rd | 45 ++++++++++++++++++++++++---------------
 man/sb.expression.Rd      | 21 ++++++++++++------
 man/sb.mutation.Rd        | 17 +++++++++++----
 man/select.features.Rd    |  6 ++++++
 man/snowball.Rd           |  8 +++----
 man/toplist.Rd            |  8 +++----
 10 files changed, 93 insertions(+), 49 deletions(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index 0c8bb78..b9fe28e 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,8 +1,9 @@
 Package: DESnowball
 Type: Package
-Title: Bagging with Distance-based Regression for Differential Gene Expression Analyses
+Title: Bagging with Distance-based Regression for Differential Gene Expression
+    Analyses
 Version: 1.0
-Date: 2013-07-20
+Date: 2014-1-2
 Author: Yaomin Xu
 Maintainer: Yaomin Xu
 Depends:
@@ -13,6 +14,10 @@ Imports:
     MASS,
     parallel,
     cluster
-Description: This package implements a statistical data mining method to compare whole genome gene expression
-    profiles with respect to the presence of a recurrent genetic disturbance event to identify the affected target genes.
+Description: This package implements a statistical data mining method to
+    compare whole genome gene expression profiles, with respect to the presence
+    of a recurrent genetic disturbance event, to identify the affected target
+    genes.
 License: GPL-3
+URL: https://github.com/snowball-project/DESnowball
+BugReports: https://github.com/snowball-project/DESnowball/issues
diff --git a/R/sb-expression-data.r b/R/sb-expression-data.r
index 30fb05d..67c08e5 100644
--- a/R/sb-expression-data.r
+++ b/R/sb-expression-data.r
@@ -1,8 +1,6 @@
 #' Gene expression data of 14 patients
-
-#' A demo dataset containing 6597 gene expression profiles on 14 patients, the corresponding
-#' mutation status is provided in \code{\link{sb.mutation}}
-
+#'
+#' A demo dataset containing 6597 gene expression profiles on 14 patients, the corresponding mutation status is provided in \code{\link{sb.mutation}}
 #' @docType data
 #' @keywords datasets
 #' @format A data.frame with 6597 rows and 14 variables
diff --git a/R/sb-mutation-data.r b/R/sb-mutation-data.r
index f0db203..50507dd 100644
--- a/R/sb-mutation-data.r
+++ b/R/sb-mutation-data.r
@@ -1,7 +1,6 @@
 #' Mutation status of 14 patients
-
+#'
 #' A character vector indicating the mutation status of 14 patients
-
 #' @docType data
 #' @keywords datasets
 #' @format A character vector of 14 elements
diff --git a/README.md b/README.md
index c77f128..021f2b5 100644
--- a/README.md
+++ b/README.md
@@ -3,11 +3,11 @@
 ## About
 
 The DESnowball package implements a statistical data mining method that compares the whole genome gene expression profiles with respect to the presence of a recurrent genetic disturbance event (
-e.g. a recurrent driver mutation) to identify the target genes affected by the event.
+e.g. a recurrent driver mutation) to identify the affected target genes.
 
-The input data for the snowball analysis are the profiling of the whole genome gene expression profiles
+The input data for the snowball analysis are the whole genome gene expression profiles
 and the mutation status of a recurrent genetic event on a group of samples. The analysis has
-been tested on the TCGA primary tumor samples. The minimum sample size required per group is three.
+been tested on the TCGA melanoma primary tumor samples. The minimum sample size required per group is three.
 ## Installation
 
 From R:
@@ -32,3 +32,12 @@ Example: snowball analysis on the demo dataset included in the package
     plotJn(sb, sb.sel)
     # get the significant gene list
     top.genes <- toplist(sb.sel)
+## References
+Xu, Y. and Sun, J. (2005) PfCluster: a new cluster analysis procedure for gene expression profiles. Presented at a conference on Nonparametric Inference and Probability With Applications to Science honoring Michael Woodroofe; September 24-25, 2005; Ann Arbor, Mich, 2005.
+
+McArdlei, B.H. and Anderson, M.J. (2001) Fitting multivariate models to community data: A comment on distance-based redundancy analysis. Ecology 82(1): 290-297.
+
+Xu, Y., Guo, X., Sun, J. and Zhao. Z. Snowball: resampling combined with distance-based regression to discover transcriptional consequences of driver mutation, manuscript.
+
+Guo, X., Xu, Y. and Zhao, Z.. Driver mutation BRAF regulates cell proliferation and apoptosis via MITF in the pathogenesis of melanoma, manuscript.
+
diff --git a/man/DESnowball-package.Rd b/man/DESnowball-package.Rd
index 2394066..b4677ba 100644
--- a/man/DESnowball-package.Rd
+++ b/man/DESnowball-package.Rd
@@ -4,22 +4,23 @@
 \title{A R package implemented Snowball approach (see references)}
 \description{
 Genome-wide differential gene expression analysis with
-respect to the presence of a recurrent driver mutation
+respect to the presence of a recurrent genetic disturbance
+(a driver mutation)
 }
 \details{
-The DESnowball package implements the Snowball approach
-(see references). It is a differential gene expression
-analysis tool that compares the whole genome gene
-expression profiles measured on tumor samples with vs.
-without a recurrent driver mutation.
+The DESnowball package implements a differential gene
+expression analysis tool that compares the whole genome
+gene expression profiles on samples relative to the
+presence of a recurrent genetic disturbance (driver
+mutation).
 
 The input data for the snowball analysis are the profiling
 of the whole genome gene expression and the mutation status
-of a recurrent driver mutation on a group of patient
-samples. The analysis has been tested on the primary tumor
-samples and the minimum sample size required per group is
-three. Snowball does not require a balanced design between
-groups (see references).
+of a recurrent genetic event on a group of samples. The
+analysis has been tested on human primary tumor samples and
+the minimum sample size required per group is three.
+Snowball does not require a balanced design between groups
+(see references).
 
 The main function of the package is \code{\link{snowball}},
 it requires two input data, named \code{y} and \code{X},
@@ -45,13 +46,23 @@ and \code{\link{toplist}} to report the top genes based on
 the user provided cutoff.
 }
 \references{
-Yaomin Xu, Xingyi Guo, Jiayang Sun, Zhongming Zhao.
-Snowball: resampling combined with distance-based
-regression to discover transcriptional consequences of
-driver mutation (submitted)
+Xu, Y. and Sun, J. (2005) PfCluster: a new cluster analysis
+procedure for gene expression profiles. Presented at a
+conference on Nonparametric Inference and Probability With
+Applications to Science honoring Michael Woodroofe;
+September 24-25, 2005; Ann Arbor, Mich, 2005.
 
-Xingyi Guo, Yaomin Xu, Zhongming Zhao. Driver mutation BRAF
+McArdlei, B.H. and Anderson, M.J. (2001) Fitting
+multivariate models to community data: A comment on
+distance-based redundancy analysis. Ecology 82(1): 290-297.
+
+Xu, Y., Guo, X., Sun, J. and Zhao. Z. Snowball: resampling
+combined with distance-based regression to discover
+transcriptional consequences of driver mutation,
+manuscript.
+
+Guo, X., Xu, Y. and Zhao, Z.. Driver mutation BRAF
 regulates cell proliferation and apoptosis via MITF in the
-pathogenesis of melanoma (submitted)
+pathogenesis of melanoma, manuscript.
 }
 
diff --git a/man/sb.expression.Rd b/man/sb.expression.Rd
index f694f88..279daa8 100644
--- a/man/sb.expression.Rd
+++ b/man/sb.expression.Rd
@@ -1,15 +1,22 @@
 \docType{data}
 \name{sb.expression}
 \alias{sb.expression}
-\title{Gene expression data of 14 patients
-A demo dataset containing 6597 gene expression profiles on 14 patients, the corresponding
-mutation status is provided in \code{\link{sb.mutation}}}
+\title{Gene expression data of 14 patients}
 \format{A data.frame with 6597 rows and 14 variables}
 \description{
-Gene expression data of 14 patients A demo dataset
-containing 6597 gene expression profiles on 14 patients,
-the corresponding mutation status is provided in
-\code{\link{sb.mutation}}
+A demo dataset containing 6597 gene expression profiles on
+14 patients, the corresponding mutation status is provided
+in \code{\link{sb.mutation}}
+}
+\references{
+Xu, Y., Guo, X., Sun, J. and Zhao. Z. Snowball: resampling
+combined with distance-based regression to discover
+transcriptional consequences of driver mutation,
+manuscript.
+
+Guo, X., Xu, Y. and Zhao, Z.. Driver mutation BRAF
+regulates cell proliferation and apoptosis via MITF in the
+pathogenesis of melanoma, manuscript.
 }
 \keyword{datasets}
 
diff --git a/man/sb.mutation.Rd b/man/sb.mutation.Rd
index 3de8711..8f416a6 100644
--- a/man/sb.mutation.Rd
+++ b/man/sb.mutation.Rd
@@ -1,12 +1,21 @@
 \docType{data}
 \name{sb.mutation}
 \alias{sb.mutation}
-\title{Mutation status of 14 patients
-A character vector indicating the mutation status of 14 patients}
+\title{Mutation status of 14 patients}
 \format{A character vector of 14 elements}
 \description{
-Mutation status of 14 patients A character vector
-indicating the mutation status of 14 patients
+A character vector indicating the mutation status of 14
+patients
+}
+\references{
+Xu, Y., Guo, X., Sun, J. and Zhao. Z. Snowball: resampling
+combined with distance-based regression to discover
+transcriptional consequences of driver mutation,
+manuscript.
+
+Guo, X., Xu, Y. and Zhao, Z.. Driver mutation BRAF
+regulates cell proliferation and apoptosis via MITF in the
+pathogenesis of melanoma, manuscript.
 }
 \keyword{datasets}
 
diff --git a/man/select.features.Rd b/man/select.features.Rd
index e855942..3dc5bf2 100644
--- a/man/select.features.Rd
+++ b/man/select.features.Rd
@@ -31,4 +31,10 @@ Gene selection based on the statistical significances
 according to the Snowball approach (see references for more
 details).
 }
+\references{
+Xu, Y., Guo, X., Sun, J. and Zhao. Z. Snowball: resampling
+combined with distance-based regression to discover
+transcriptional consequences of driver mutation,
+manuscript.
+}
 
diff --git a/man/snowball.Rd b/man/snowball.Rd
index 6a580a6..8b65e0b 100644
--- a/man/snowball.Rd
+++ b/man/snowball.Rd
@@ -148,9 +148,9 @@ top.genes <- toplist(sb.sel)
 }
 }
 \references{
-Yaomin Xu, Xingyi Guo, Jiayang Sun, Zhongming Zhao.
-Snowball: Resampling combined with distance-based
-regression to discover transcriptional consequences of
-driver mutation (submitted)
+Xu, Y., Guo, X., Sun, J. and Zhao. Z. Snowball: resampling
+combined with distance-based regression to discover
+transcriptional consequences of driver mutation,
+manuscript.
 }
 
diff --git a/man/toplist.Rd b/man/toplist.Rd
index 338ac58..ba6083b 100644
--- a/man/toplist.Rd
+++ b/man/toplist.Rd
@@ -16,9 +16,9 @@ a data.frame with two columns \code{RD} and \code{pvalue}
 Report the top list based on p values.
 }
 \references{
-Yaomin Xu, Xingyi Guo, Jiayang Sun, Zhongming Zhao.
-Snowball: Resampling combined with distance-based
-regression to discover transcriptional consequences of
-driver mutation (submitted)
+Xu, Y., Guo, X., Sun, J. and Zhao. Z. Snowball: resampling
+combined with distance-based regression to discover
+transcriptional consequences of driver mutation,
+manuscript.
 }
 