Multitask Learning for Transcriptional Regulatory Network Inference in Julia

PeterDeWeirdt/MTL_JULIA

MTL_JULIA

MTL_JULIA is a pipeline for transcriptional regulatory network (TRN) inference using multitask learning (MTL), written in MATLAB and Julia.

References

  1. Castro, Dayanne M., et al. "Multi-study inference of regulatory networks for more accurate models of gene regulation." bioRxiv (2018): 279224.
  2. Miraldi, Emily R., et al. "Leveraging chromatin accessibility for transcriptional regulatory network inference in T Helper 17 Cells." bioRxiv (2018): 292987.

Highlights

  1. Written in Julia for speed. For reference, TRN inference on the Th17 dataset took 37 minutes on a MacBook Pro using 6 processor cores.
  2. Parallel implementation
  3. Parameter selection with EBIC or cross validation

Installation

  1. You need a licensed version of MATLAB
  2. Download JuliaPro Version 0.6.3 or greater (there are other download options too)
  3. Download this GitHub repository
  4. From Julia, run "Add_packages.jl" to install the required Julia packages

Th17 Example

Interactive (recommended for first run)

First, we use MATLAB for transcription factor estimation and prior matrix creation.

  1. Open "Th17example_setup.m"
  2. Set options in the script. For example, to run MATLAB serially:
parallel = false;
  3. Run "Th17example_setup.m"

Now we use the outputs from MATLAB for network inference in Julia. Note: Julia reads the file paths for the MATLAB outputs from "setup.txt" in the setup folder, so no user specification is necessary.

  1. Open "Th17example_inference.jl"
  2. Set options in the script. For example, to run Julia serially:
parallel = false

or with a different number of processors:

Nprocs = 2 

To check the number of processors on your machine, type the following in Julia (in Julia 0.7 and later, this constant is named Sys.CPU_THREADS):

Sys.CPU_CORES 
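
The same check can be made from a shell before launching Julia. The following is a minimal sketch, not part of the repository, assuming a system where `getconf _NPROCESSORS_ONLN` is supported (Linux and macOS). The function name `count_cores`, the variable `ncores`, and the fallback to 1 are illustrative choices.

```shell
#!/bin/sh
# Report the number of online processors.
# Assumption: getconf supports _NPROCESSORS_ONLN (true on Linux and macOS).
# If the query fails, fall back to 1.
count_cores() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    # Guard against empty or non-numeric output.
    case "$n" in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

ncores=$(count_cores)
echo "Available cores: $ncores"
```

Nprocs would normally be set no higher than the number reported here.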

There are two main parameter selection strategies to choose from:

Extended Bayesian Information Criterion (EBIC)
Fit = :ebic
getFitsParallel(DataMatPaths, Fit, Smin, Smax, Ssteps, nB, TaskNames, FitsOutputDir,
        FitsOutputMat, tolerance = tolerance, useBlockPrior = useBlockPrior)
Cross Validation
Fit = :cv
nfolds = 2
getFitsParallel(DataMatPaths, Fit, Smin, Smax, Ssteps, nB, TaskNames, FitsOutputDir,
        FitsOutputMat, tolerance = tolerance, useBlockPrior = useBlockPrior, nfolds = nfolds)
  3. Run "Th17example_inference.jl"
  4. Check the outputs folder for the results

Shell Script

The above steps are automated in the shell script "Th17example_MTLpipeline.sh". Before running it, you will likely have to set the MATLAB and Julia binary paths:

matlab="/Applications/path/to/bin/matlab"
julia="/Applications/path/to/bin/julia"
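
Before launching the pipeline, it can help to confirm that both variables point to real executables. The sketch below is a hypothetical helper, not something taken from "Th17example_MTLpipeline.sh"; the function name `check_binary` is an assumption, and the paths are the placeholders shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): verify that a path
# points to an executable file before the pipeline tries to use it.
check_binary() {
    name=$1
    path=$2
    if [ -x "$path" ] && [ ! -d "$path" ]; then
        echo "$name: ok"
        return 0
    fi
    echo "$name: not found or not executable at $path" >&2
    return 1
}

# Placeholder paths copied from the README; replace with real locations.
matlab="/Applications/path/to/bin/matlab"
julia="/Applications/path/to/bin/julia"

check_binary matlab "$matlab" || echo "Set the matlab variable before running."
check_binary julia "$julia" || echo "Set the julia variable before running."
```

With the placeholder paths left unchanged, both checks fail and print a reminder, which is the intended behavior until real paths are filled in.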