10. Post Processing
This page describes the post-processing and interpretation of results after running the MetScale workflows.
After all analyses are complete, a final post-processing command organizes all datasets by sample name. The post_processing_move_samples_dir_workflow snakemake rule creates sub-directories in the data/ directory and moves all files associated with each sample into the corresponding sub-directory.
This can be executed with the following command:
snakemake --use-singularity --configfile=config/my_custom_config.json post_processing_move_samples_dir_workflow
Before executing this command, we recommend that users ensure they are completely finished with all analyses. A workflow cannot be re-executed unless its data files are first moved back into the metagenomics/workflows/data/ directory.
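If a workflow does need to be re-run after the move, the files have to be returned to the data directory first. The following is a minimal sketch of one way to do that, not part of MetScale itself. The function name restore_sample_files is hypothetical, and the example paths assume the sample sub-directory follows the <samples_name>_finished pattern shown later on this page.

```python
import shutil
from pathlib import Path


def restore_sample_files(sample_dir: Path, data_dir: Path) -> list[Path]:
    """Move every entry in sample_dir back into data_dir.

    Hypothetical helper (not a MetScale command). Entries whose name
    already exists in data_dir are skipped so nothing is overwritten.
    Returns the list of new paths.
    """
    moved = []
    for item in sorted(sample_dir.iterdir()):
        target = data_dir / item.name
        if target.exists():
            # Leave the existing file alone rather than overwrite it.
            continue
        shutil.move(str(item), str(target))
        moved.append(target)
    return moved


# Example call (paths are assumptions; adjust to your own layout):
# restore_sample_files(Path("data/my_sample_finished"), Path("data"))
```

Skipping existing targets is a deliberately cautious choice: it avoids silently replacing a file that a later run has already regenerated.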
Aside from the organizational benefits of moving the files for each sample into their own directory, we are also exploring options for final reports that could be generated from all files located within a single directory.
The following rules are available for execution in the post processing workflow (yellow stars indicate the terminal rules):
Once all your data has been moved with the post_processing_move_samples_dir_workflow command, you can generate a report of all your results. This creates a summary-report.html file, which includes detailed summaries of all your workflow analyses. If you have run any taxonomic analyses from the taxonomic_classifier workflows, a graphical summary comparing the results across the tools will also be generated.
First, run the setup from the /workflow/post_processing/ directory. This only needs to be done once. If you accidentally run the setup again, you will see an on-screen error message stating Execution Halted. This prevents any setup files from being overwritten. You will still be able to process and generate the final report.
The setup can be executed with the following command:
cd post_processing/
python setup_post_processing.py --input <path_to_data_directory> --post <path_to_post_processing_directory>
Depending on the setup of your directories, those paths may look like this:
python setup_post_processing.py --input ~/metscale/workflows/data/ --post ~/metscale/workflows/post_processing/
You may then generate the final report from the metscale/workflows directory with the following commands:
cd ..
snakemake --use-singularity --configfile=config/my_custom_config.json post_processing_create_final_report_workflow
Once this has run successfully, you should see a summary-report.html file in your sample's sub-directory (e.g. /data/<samples_name>_finished/). Some features of the final report require the tool output files to be in the same folder as the summary report HTML file when you view the results. If you have run any taxonomic analyses, you will also see an abundance_graph.png file, which summarizes the relative abundances of the taxa identified across all the taxonomic classifier tools.
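A quick way to confirm that the report step produced its outputs is to look for them in each sample sub-directory. The sketch below is an illustration only: the function name find_reports is hypothetical, and it assumes the sub-directories end in _finished, as in the example path above.

```python
from pathlib import Path


def find_reports(data_dir: Path) -> dict[str, bool]:
    """Map each sample sub-directory that holds a summary-report.html
    to whether abundance_graph.png sits alongside it.

    Hypothetical helper; the '*_finished' pattern is an assumption
    based on the example path, not a documented guarantee.
    """
    results = {}
    for report in sorted(data_dir.glob("*_finished/summary-report.html")):
        sample_dir = report.parent
        results[sample_dir.name] = (sample_dir / "abundance_graph.png").is_file()
    return results


# Example call (path is an assumption):
# find_reports(Path("data"))
```

A value of False simply means no graph was found next to the report, which is expected when no taxonomic analyses were run for that sample.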
For each tool, a color indicates the signal strength of each species listed on the Y axis, according to the following scheme:
Color | Species Signal | KrakenUniq kmers | Kraken2 reads | Bracken reads | Kaiju reads | Sourmash f match | Mash identity |
---|---|---|---|---|---|---|---|
Red | Very Strong | >10,000 | >100,000 | >100,000 | >100,000 | >0.60 | >0.95 |
Orange | Strong | 5,000-10,000 | 30,000-100,000 | 30,000-100,000 | 30,000-100,000 | 0.20-0.60 | 0.90-0.95 |
Yellow | Moderately Strong | 2,000-5,000 | 10,000-30,000 | 10,000-30,000 | 10,000-30,000 | 0.15-0.20 | 0.85-0.90 |
Green | Moderate | 1,000-2,000 | 1,000-10,000 | 1,000-10,000 | 1,000-10,000 | 0.10-0.15 | 0.80-0.85 |
Blue | Weak | 500-1,000 | 100-1,000 | 100-1,000 | 100-1,000 | 0.05-0.10 | 0.75-0.80 |
Grey | Very Weak | 0-500 | 0-100 | 0-100 | 0-100 | 0-0.05 | 0-0.75 |
White | No Species Signal | 0 | 0 | 0 | 0 | 0 | 0 |
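
The table can also be read as a simple lookup. The sketch below encodes its thresholds in Python for illustration and is not the code MetScale uses to draw the graph. The tool keys and the function name classify_signal are hypothetical. Because adjacent ranges in the table share their endpoints, the sketch assumes a boundary value belongs to the lower category (for example, exactly 10,000 KrakenUniq kmers is Orange, since Red is listed as >10,000).

```python
from bisect import bisect_left

# Categories from weakest to strongest, as (color, species signal).
SIGNALS = [
    ("Grey", "Very Weak"),
    ("Blue", "Weak"),
    ("Green", "Moderate"),
    ("Yellow", "Moderately Strong"),
    ("Orange", "Strong"),
    ("Red", "Very Strong"),
]

# Upper bound of each category except Red, taken from the table.
# The dictionary keys are illustrative names, not MetScale identifiers.
THRESHOLDS = {
    "krakenuniq_kmers": [500, 1_000, 2_000, 5_000, 10_000],
    "kraken2_reads": [100, 1_000, 10_000, 30_000, 100_000],
    "bracken_reads": [100, 1_000, 10_000, 30_000, 100_000],
    "kaiju_reads": [100, 1_000, 10_000, 30_000, 100_000],
    "sourmash_f_match": [0.05, 0.10, 0.15, 0.20, 0.60],
    "mash_identity": [0.75, 0.80, 0.85, 0.90, 0.95],
}


def classify_signal(tool: str, value: float) -> tuple[str, str]:
    """Return (color, species signal) for a tool's reported value."""
    if value <= 0:
        return ("White", "No Species Signal")
    # bisect_left counts the bounds strictly below value, so a value
    # equal to a bound stays in the lower category (an assumption).
    return SIGNALS[bisect_left(THRESHOLDS[tool], value)]
```

For instance, classify_signal("kraken2_reads", 50_000) falls in the 30,000-100,000 range and returns ("Orange", "Strong").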