ReFACTor v1.0

Reference-Free Adjustment for Cell-Type composition (ReFACTor) is an unsupervised method for the correction of cell-type heterogeneity in epigenome-wide association studies (EWAS), which is based on a variant of principal component analysis (PCA). ReFACTor is described in the following paper (upcoming).

As described below, ReFACTor is available in R and in Python. For users working with large datasets we recommend the Python version, which is much faster.

Download

Press the 'Download ZIP' button on the right side of the screen
Extract the ZIP file to a folder

Dependencies for the Python version are described at the end of this file.

Input

ReFACTor takes the following arguments:

datafile: A file containing a sites-by-samples matrix of beta-normalized methylation levels. The first row should contain the sample IDs and the first column should contain the CpG IDs (see demo_files/demo_datafile.txt for an example)
k: the number of assumed cell types
t (optional): The number of sites to use for computing the ReFACTor components (default is 500)
numcomp (optional): The number of ReFACTor components to output (default is same as k)
out (optional): Prefix for the output files (default is 'refactor.')
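
As a sanity check before running ReFACTor, the layout described for datafile can be verified with a few lines of Python. The following is a minimal sketch, not part of ReFACTor: the function name read_methylation_matrix is hypothetical, and it assumes that fields are separated by whitespace and that beta values lie between 0 and 1.

```python
def read_methylation_matrix(path):
    """Parse a sites-by-samples text file into (sample_ids, site_ids, values)."""
    with open(path) as handle:
        # Assumption: fields are separated by whitespace (tabs or spaces).
        rows = [line.split() for line in handle if line.strip()]
    if len(rows) < 2:
        raise ValueError("expected a header row and at least one site row")
    header, body = rows[0], rows[1:]
    n_samples = len(body[0]) - 1
    if n_samples < 1 or len(header) not in (n_samples, n_samples + 1):
        raise ValueError("header does not match the number of sample columns")
    # Tolerate an optional corner label above the CpG ID column.
    sample_ids = header[-n_samples:]
    site_ids, values = [], []
    for fields in body:
        if len(fields) != n_samples + 1:
            raise ValueError("site %s: wrong number of values" % fields[0])
        levels = [float(field) for field in fields[1:]]
        # Beta values are proportions, so anything outside [0, 1] is suspect.
        if any(level < 0.0 or level > 1.0 for level in levels):
            raise ValueError("site %s: value outside [0, 1]" % fields[0])
        site_ids.append(fields[0])
        values.append(levels)
    return sample_ids, site_ids, values
```

Calling read_methylation_matrix on your own file returns the sample IDs, the CpG IDs and the matrix rows, or raises a ValueError that names the first offending site.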

Output

The software outputs two files:

refactor.out.components.txt - a matrix with the first several ReFACTor components for each individual
refactor.out.rankedlist.txt - a ranked list of the methylation sites, from the most informative to the least informative
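
Because the ranked list is ordered from most to least informative, a common follow-up is to keep only its first few entries. The sketch below illustrates this under a loud assumption: it treats the ranked list file as plain text with one site ID per line, which this document does not specify. The function name top_ranked_sites is hypothetical.

```python
from itertools import islice


def top_ranked_sites(rankedlist_path, n):
    """Return the first n site IDs from a ranked list file.

    Assumption: the file holds one site ID per line, most informative first.
    """
    with open(rankedlist_path) as handle:
        # Skip blank lines and strip surrounding whitespace from each ID.
        non_empty = (line.strip() for line in handle if line.strip())
        return list(islice(non_empty, n))
```

For example, top_ranked_sites("refactor.out.rankedlist.txt", 500) would return the 500 highest-ranked site IDs if the file follows the assumed layout.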

R

To use the R version, go to the "R" folder (run "cd R").

You can run a demo or execute ReFACTor on your own data.

Demo

The following demo computes the ReFACTor components of a simulated example dataset and performs an EWAS.

Run from command line:

Rscript demo.R

Execution

The refactor function, defined in refactor.R, can be executed within an R script as follows:

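# <R code>
source("refactor.R")
k = 5 # Number of assumed cell types
datafile = "../demo_files/demo_datafile.txt"
num_components = 5 # Not passed to the call below, so the default (same as k) applies
t = 500 # Not passed to the call below, so the default (500) applies
results <- refactor(datafile,k)
RC <- results$refactor_components # Extract the ReFACTor components
ranked_list <- results$ranked_list # Extract the list of sites ranked by ReFACTor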

Python

To use the Python version, go to the "python" folder (run "cd python").

You can run a demo or execute ReFACTor on your own data.

Demo

The following demo computes the ReFACTor components of a simulated example dataset and performs an EWAS.

Run from command line:

python demo.py

Execution

Execute from the command line:

python refactor.py --datafile <datafile> --k <k>

or, if including the optional parameters:

python refactor.py --datafile <datafile> --k <k> --t <t> --numcomp <num_components> --out <out_prefix>
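
When ReFACTor is run from another script, for example over several values of k, it can help to assemble the command line programmatically. The sketch below is not part of ReFACTor. It uses only the flags listed in this document (--datafile, --k, --t, --numcomp, --out), and the helper names build_refactor_command and run_refactor are hypothetical. It assumes that "python" on your PATH is the interpreter you want and that the working directory is the "python" folder.

```python
import subprocess


def build_refactor_command(datafile, k, t=None, numcomp=None, out=None):
    """Return the argument list for one refactor.py run.

    Optional parameters left as None are omitted, so refactor.py
    falls back to its own defaults for them.
    """
    command = ["python", "refactor.py", "--datafile", datafile, "--k", str(k)]
    for flag, value in (("--t", t), ("--numcomp", numcomp), ("--out", out)):
        if value is not None:
            command += [flag, str(value)]
    return command


def run_refactor(datafile, k, **optional):
    """Run refactor.py and raise CalledProcessError if it exits with an error."""
    subprocess.check_call(build_refactor_command(datafile, k, **optional))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file names contain spaces.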

Parameters selection

Data preprocessing

Dependencies

For the Python version we recommend using a standard Python distribution such as Anaconda (https://www.continuum.io/downloads). This release of ReFACTor was implemented for Python v2.7 and has the following dependencies:

numpy
scipy
sklearn (installed as the scikit-learn package)
matplotlib (required only for demo.py)
statsmodels (required only for demo.py)
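
To confirm that these modules are importable in the interpreter you plan to use, a short check such as the following can be run first. This is a minimal sketch rather than part of ReFACTor; the function name missing_modules and the two list names are hypothetical, and the module names are taken from the list above.

```python
import importlib

REQUIRED = ["numpy", "scipy", "sklearn"]
DEMO_ONLY = ["matplotlib", "statsmodels"]


def missing_modules(names):
    """Return the subset of names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Format strings keep the output identical under Python 2.7 and Python 3.
    print("Missing required modules: %s" % missing_modules(REQUIRED))
    print("Missing demo-only modules: %s" % missing_modules(DEMO_ONLY))
```

An empty list on both lines means that every dependency was found.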

Citing ReFACTor

If you use ReFACTor in any published work, please cite the manuscript describing the method (upcoming).

Authors

This software was developed by Reut Yedidim, Noah Zaitlen and Elior Rahmani.

To report bugs or ask questions, please email Elior Rahmani at: [email protected]
