Managing multiple projects
Here, we outline different ways to manage data and software if you want to run the ARMOR workflow on more than one project or data set:
- Keep an ARMOR repository and the data of each project together in a single directory. In this way, the software (the `Snakefile`, the scripts, the `Rmd` files and the `config.yaml`) and the data from each project are contained in a single directory. The workflow configuration is physically separate for each project, which makes results easy to reproduce. However, you will have ARMOR in multiple physical locations, so the installed software will be duplicated if you use the `--use-conda` option, because it creates a conda environment in that directory (e.g., `ARMOR/.snakemake/conda/7a4f9e69`).
- Clone the ARMOR repository only once and have a separate directory for each project. In this way, the ARMOR directory can be reused for many different projects. This is useful if you do not want to recreate conda environments for each project and will use the same `Snakefile` and scripts for every project. In this case, you will need a different `config.yaml` file for each project (either in the ARMOR directory or in each project directory), and you will have to specify the path to that `config.yaml` file every time you run the workflow (e.g., `snakemake --configfile projectX/config.yaml`).
- Do not update ARMOR in the middle of an analysis. This ensures reproducibility and also avoids dependency clashes and mixing of incompatible versions.
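As a minimal sketch of the second option (one ARMOR clone, one `config.yaml` per project), the loop below only prints the command you would run for each project. The helper name `armor_command` and the project name `projectY` are hypothetical, and the layout assumes that each project's `config.yaml` sits in a directory named after the project.

```shell
#!/bin/sh
# Sketch only: armor_command and projectY are illustrative names, not part of ARMOR.
# Assumed layout: <project>/config.yaml, with paths relative to the ARMOR directory.

# Print the snakemake invocation for one project instead of running it.
armor_command() {
    printf 'snakemake --use-conda --configfile %s/config.yaml\n' "$1"
}

# One shared ARMOR clone, one configuration file per project.
for project in projectX projectY; do
    armor_command "$project"
done
```

To actually run the workflow, execute the printed commands from inside the ARMOR directory. Because every project is run from the same clone, the conda environments under `.snakemake/conda` are created once and shared.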
Further details can be found at the Running the analysis page.