Post-processing & Lightweight Updates to Pipeline Output
CWL files and workflows to accompany the helix_filters_01 repo. Supported by infrastructure in the pluto submodule.
Clone this repo with:
git clone --recursive https://github.com/mskcc/pluto-cwl.git
cd pluto-cwl
Install dependencies for the repo with the command:
make install
This will check out the included git submodules and install a local conda with extra dependencies.
Use this command to activate the installed environment for running workflows:
. env.juno.sh toil
This will:
- update your environment to use the cwltool and toil installed in the local conda
- (if running on Juno HPC) update your environment with Toil variables needed to run on Juno
- (if running on Juno HPC) update your environment to use pre-cached Singularity containers located on Juno
The primary entry point for the workflow is cwl/workflow_with_facets.cwl.
You can run a CWL included in this repo by using the wrapper scripts bundled in the pluto submodule:
- pluto/run-cwltool.sh for simple use cases
- pluto/run-toil.sh if parallel processing and HPC (LSF) usage is required
Development and testing takes place via the test suite.
The included test suite can be run with:
make test
It typically takes about 45 minutes to run all included tests.
- NOTE: tests require data sets that are pre-saved on the juno server
Some very large integration tests are skipped by default. To include all tests, export the environment variable LARGE_TESTS=True or include it in the command line invocation. You can also change the CWL engine from cwltool to toil, among other settings, the same way. For example:
LARGE_TESTS=True CWL_ENGINE=Toil PRINT_COMMAND=True TMP_DIR=/scratch USE_LSF=True make test
Available environment variable settings are derived from the pluto.settings submodule.
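To make the idea of environment-driven settings concrete, here is a minimal sketch of how variables such as LARGE_TESTS and CWL_ENGINE could be read in Python. It is not the code in pluto.settings: the helper names env_flag and read_settings, the accepted truthy spellings, the lowercasing of the engine name, and every default other than the cwltool engine and the skipped large tests are assumptions made for illustration.

```python
import os


def env_flag(name, default=False, environ=None):
    """Interpret an environment variable as a boolean (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if value is None:
        return default
    # Assumption: "True", "true", "1" and "yes" all count as enabled.
    return value.strip().lower() in ("1", "true", "yes")


def read_settings(environ=None):
    """Collect the settings named in this README into a plain dict."""
    environ = os.environ if environ is None else environ
    return {
        # Large integration tests are skipped unless explicitly enabled.
        "LARGE_TESTS": env_flag("LARGE_TESTS", False, environ),
        # cwltool is the default engine; the lowercasing is an assumption.
        "CWL_ENGINE": environ.get("CWL_ENGINE", "cwltool").lower(),
        "PRINT_COMMAND": env_flag("PRINT_COMMAND", False, environ),
        "USE_LSF": env_flag("USE_LSF", False, environ),
        "KEEP_TMP": env_flag("KEEP_TMP", False, environ),
        # None here means "no override given"; the real default is not shown.
        "TMP_DIR": environ.get("TMP_DIR"),
    }
```

Passing a plain dict as environ, for example read_settings({"LARGE_TESTS": "True", "CWL_ENGINE": "Toil"}), makes it easy to check how a given combination of variables would be interpreted without touching the real environment.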
An extra recipe is included that can run the tests in parallel. For example, to run 8 tests at once, you can use this command:
make parallel-test
For development purposes, it is helpful to be able to run only a specific test case, or subset of tests.
You can run just the script with the tests you are interested in, such as:
python tests/test_workflow_cwl.py
You can further select which test case(s) from the script you wish to run by adding their labels as arguments:
python tests/test_workflow_cwl.py TestClassName
python tests/test_workflow_cwl.py TestClassName.test_function
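The label syntax above is the standard one from Python's unittest module, where ClassName selects every test in a class and ClassName.test_function selects a single method. The sketch below shows that resolution in isolation, assuming the test scripts are ordinary unittest files. TestExample, its methods, and run_selected are hypothetical names and are not part of this repo.

```python
import types
import unittest


class TestExample(unittest.TestCase):
    """Stand-in test case; the class and method names are hypothetical."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)

    def test_upper(self):
        self.assertEqual("cwl".upper(), "CWL")


def run_selected(labels):
    """Run only the tests named by labels such as 'TestExample' or
    'TestExample.test_addition'; run everything when labels is empty."""
    # Expose this file's globals as attributes so the loader can resolve labels.
    namespace = types.SimpleNamespace(**globals())
    loader = unittest.TestLoader()
    if labels:
        suite = loader.loadTestsFromNames(labels, namespace)
    else:
        suite = loader.loadTestsFromModule(namespace)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_selected(["TestExample.test_addition"]) runs one test, while run_selected(["TestExample"]) runs both methods of the class, which mirrors the two command lines shown above.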
This can be combined with the environment variables described above (such as LARGE_TESTS, PRINT_COMMAND, KEEP_TMP, etc.).
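If you script such runs from Python rather than typing them in a shell, the combination of environment variables and test labels can be assembled as below. This is only a sketch: build_test_invocation is a hypothetical helper, and it assumes the test script accepts labels as positional arguments exactly as in the commands above.

```python
import os
import sys


def build_test_invocation(script, labels=(), settings=None, base_env=None):
    """Return (cmd, env) for running one test script with extra settings.

    settings maps variable names (e.g. LARGE_TESTS) to values, which are
    converted to strings because environment values must be text.
    """
    base_env = os.environ if base_env is None else base_env
    env = dict(base_env)  # copy, so the caller's environment is not modified
    for name, value in (settings or {}).items():
        env[name] = str(value)
    # Use the current interpreter so the active conda environment is reused.
    cmd = [sys.executable, script, *labels]
    return cmd, env


# Example (not executed here):
#   import subprocess
#   cmd, env = build_test_invocation(
#       "tests/test_workflow_cwl.py",
#       ["TestClassName.test_function"],
#       {"LARGE_TESTS": True, "KEEP_TMP": True},
#   )
#   subprocess.run(cmd, env=env, check=True)
```

Building the command and environment separately from running them keeps the helper easy to inspect before anything is launched.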