
Managing software

khembach edited this page Apr 25, 2019 · 22 revisions

All software packages required by the workflow can be installed in conda environments, using the provided environment files (envs/environment.yaml and envs/environment_R.yaml).

The environment files contain version numbers of all the required software, reflecting the combination of software versions with which the workflow has been tested. In order to install and use a newer version of any of the software packages, modify the version numbers in the environment file accordingly.
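Before changing a version number, it can help to list what an environment file currently pins. The following is a minimal sketch, not part of ARMOR: `list_pins` is a hypothetical helper, and it assumes the dependencies are written as `- name=version` lines, as is usual in conda environment files.

```shell
# Hypothetical helper: print "name version" for every line of the form
# "- name=version" (an optional build string after a second "=" is dropped).
# Lines without "=", such as channel names, are skipped.
list_pins() {
  sed -n 's/^[[:space:]]*-[[:space:]]*\([^[:space:]=]*\)=\([^[:space:]=]*\).*$/\1 \2/p' "$1"
}

# Example use:
# list_pins envs/environment.yaml
# list_pins envs/environment_R.yaml
```

After editing a pin, the environment has to be created again (or updated) for the change to take effect.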

Troubleshooting

If you run into problems with the software installation, please check our troubleshooting page for help.

How do you want to manage your software?

The diagram above shows the three ways in which you can manage the software for the ARMOR workflow: using conda, using a manually created conda environment together with a system R installation, or installing all required software manually. If you want to use a system R installation, you will have to adjust some parameters in the config.yaml and the .Renviron file (see "Using a system R installation" below).

You can manage the software in the following 3 different ways:

1. Using conda to install software (recommended)

First, ensure that conda is available and, if necessary, add the channels r, conda-forge and bioconda (see e.g. here).

Also make sure Snakemake is installed. You can use conda install -c bioconda -c conda-forge snakemake or do a global installation as described here.
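A quick way to confirm both prerequisites is to check that the executables are on your PATH. This is a minimal sketch: `missing_tools` is a hypothetical helper, not something ARMOR provides, and the commented `conda config` lines simply add the three channels named above.

```shell
# Hypothetical helper: print every tool from the argument list
# that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example: report what still has to be installed.
# missing_tools conda snakemake
#
# Adding the channels named above:
# conda config --add channels r
# conda config --add channels conda-forge
# conda config --add channels bioconda
```

If the function prints nothing, both tools were found.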

1a. Using conda to also install R and all required R packages (recommended).

We recommend installing all software, including R and the required R packages, with conda.

The setup Snakemake rule can be used to create the required conda environments and install the necessary R packages:

snakemake --use-conda setup

1b. Using a system R installation

You can use your system R installation and still manage all other software with conda. First configure your system R as described in "Using a system R installation" below.

Then, you can create the conda environment and install the required software and R packages with

snakemake --use-conda setup

2. Manually creating a conda environment

You can use your own R installation (see below) in combination with a manually created conda environment to run all the pre-processing steps. This might be useful if you are planning to analyze multiple datasets, because the installation of the required R packages in the R conda environment takes a long time. You can manually create a conda environment with

conda env create --name ARMOR --file envs/environment.yaml

And activate it with

conda activate ARMOR, or source activate ARMOR for conda versions under 4.6.
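If you script the activation, the choice between the two commands can be derived from the conda version. The sketch below is illustrative only: `activation_command` is a hypothetical helper, and it assumes a version string of the form major.minor.patch, such as the second field printed by conda --version.

```shell
# Hypothetical helper: print the activation command that matches a
# given conda version string (for example "4.5.12" or "4.6.14").
activation_command() {
  version=$1
  major=${version%%.*}   # text before the first dot
  rest=${version#*.}     # text after the first dot
  minor=${rest%%.*}      # second version component
  if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 6 ]; }; then
    echo "conda activate ARMOR"
  else
    echo "source activate ARMOR"
  fi
}

# Example use:
# activation_command "$(conda --version | awk '{print $2}')"
```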

This run-mode requires a system R installation. After configuring it as described below, install the required R packages from within the ARMOR environment with

snakemake setup

To check that you are in the environment, run conda info --envs; the active environment is marked with an asterisk. To exit the environment, use conda deactivate.
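A script can make the same check without reading the conda info --envs output, because conda sets the CONDA_DEFAULT_ENV variable when an environment is activated. This is a minimal sketch; `in_conda_env` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the named conda environment
# is the currently active one.
in_conda_env() {
  [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

# Example: stop early instead of installing R packages from outside
# the ARMOR environment.
# in_conda_env ARMOR || { echo "activate ARMOR first" >&2; exit 1; }
```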

For more details on managing environments see here.

Using a system R installation

In some situations (e.g., on macOS), installing R packages inside the conda R is difficult. In these cases, or if you simply prefer to use a system R installation, make the following modifications:

  • Set the value of the useCondaR variable in config.yaml to False.
  • Set the path to the system R binary in the Rbin variable in config.yaml.
  • Set the path to the desired R library in the .Renviron file. This can be an existing library directory, or an empty directory. If packages need to be installed, write access to this directory is needed.
  • In case the library directory defined in the previous step does not exist, create it.
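The checks behind these steps can be sketched in shell. This is an illustration rather than part of ARMOR: the helper names `config_value` and `prepare_r_library` are hypothetical, the library path in the example is a placeholder, and the sketch assumes that config.yaml stores useCondaR and Rbin as plain top-level `key: value` lines.

```shell
# Hypothetical helper: print the value of a top-level "key: value" line.
# Assumes plain scalars without quotes or trailing comments.
config_value() {
  sed -n "s/^$1:[[:space:]]*//p" "$2" | head -n 1
}

# Hypothetical helper: create the R library directory if it is missing
# and succeed only if it is writable afterwards.
prepare_r_library() {
  mkdir -p "$1" && [ -w "$1" ]
}

# Example use (the library path is a placeholder for your own setup):
# [ "$(config_value useCondaR config.yaml)" = "False" ] || echo "useCondaR is not False"
# [ -x "$(config_value Rbin config.yaml)" ] || echo "Rbin is not an executable file"
# prepare_r_library "$HOME/R/armor-library" || echo "library directory is not writable"
```

The path passed to `prepare_r_library` should be the same directory that you set in the .Renviron file.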

NOTE: You can use your own R with any of the three run-modes; it is required for run-modes 1b, 2 and 3.

3. Installing software manually

If you don't want to use conda, you have to make sure that all necessary software is installed and available in the path. The following software is used by the workflow:

You also need to make sure that all necessary R packages are installed. The workflow uses the following packages:

This list of software and packages can also be found at envs/environment.yaml and scripts/install_pkgs.R, respectively.

Summary

This diagram shows the three run-modes explained above. In the "Requirements" boxes on the left, the required files and software are highlighted in yellow. Specific parameters that need to be adjusted are listed with each file. The parts highlighted in green are the path to your system R binary and the desired R library. These paths are specific to your setup and must be filled in accordingly. The boxes in the middle column list the code that needs to be run in order to install the software and to run the pipeline.
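The setup commands given for the individual run-modes can be collected in one place. The following is a minimal sketch with a hypothetical helper, `setup_command`, that only restates the commands from the sections above. Run-mode 3 is treated as an error because no setup command is given for it here.

```shell
# Hypothetical helper: print the setup command for a run-mode label.
setup_command() {
  case "$1" in
    # 1a and 1b: conda manages the environments (1b also needs system R configured).
    1a|1b) echo "snakemake --use-conda setup" ;;
    # 2: run from within the activated ARMOR environment, with system R configured.
    2) echo "snakemake setup" ;;
    # 3: all software and R packages are installed manually.
    3) echo "run-mode 3: install software and R packages manually" >&2; return 1 ;;
    *) echo "unknown run-mode: $1" >&2; return 1 ;;
  esac
}

# Example use:
# setup_command 1a
```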

Checking software versions

The workflow contains two rules to check the versions of the software that has been used. To check the versions of R packages, run snakemake --use-conda listpackages (or snakemake listpackages if you are managing the software separately). This rule parses the output files generated by R CMD BATCH, extracts all R packages that were used, and writes the results to a text file. To check the versions of the other software, run snakemake --use-conda softwareversions (or snakemake softwareversions). In addition, the log subdirectory of the specified output directory contains log files, which should state the versions of all software that was used.

Operating system compatibility

ARMOR has been tested on macOS and Linux systems.