diff --git a/README.md b/README.md
index 1d04055..1f4e6e6 100755
--- a/README.md
+++ b/README.md
@@ -61,7 +61,7 @@ mamba install --only-deps -c ursky metawrap-mg
 # `conda install --only-deps -c ursky metawrap-mg` also works, but much slower

 # OR
-mamba install biopython blas=2.5 blast=2.6.0 bmtagger bowtie2 bwa checkm-genome fastqc kraken=1.1 kraken=2.0 krona=2.7 matplotlib maxbin2 megahit metabat2 pandas prokka quast r-ggplot2 r-recommended salmon samtools=1.9 seaborn spades trim-galore
+mamba install biopython blas=2.5 blast=2.6.0 bmtagger bowtie2 bwa checkm-genome fastqc kraken=1.1 kraken=2.0 krona=2.7 matplotlib maxbin2 megahit metabat2 pandas prokka quast r-ggplot2 r-recommended salmon samtools=1.9 seaborn spades trim-galore minimap2
 # Note: this last solution is more universal, but you may need to manually install concoct=1.0 and pplacer.
 ```

diff --git a/Usage_tutorial.md b/Usage_tutorial.md
index 6c2608b..1fdd6a3 100644
--- a/Usage_tutorial.md
+++ b/Usage_tutorial.md
@@ -222,6 +222,32 @@ Note: This graph no longer has `Binning_refiner` in it, to reduce confusion. If

 As you can see, the refinment process signifficantly produced the best bin set in terms of both compleiton and contamination. Keep in mind that these improvements are even more dramatic in more complex samples.

+## Step 5.5 (optional): Cleaning bins with MDMcleaner
+
+To clean the bins, install [MDMcleaner](https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkac294/6583244?login=false) in a separate conda environment, following the directions on its GitHub [page](https://github.com/KIT-IBG-5/mdmcleaner).
+### Installation
+In short, installation via conda takes the following steps:
+```
+# Make conda environment
+conda create -n mdmcleaner
+conda activate mdmcleaner
+
+# Run the installation
+conda install -c bioconda mdmcleaner
+```
+After installation, the MDMcleaner reference databases must be downloaded as well. **This takes a long time. For more options, see the tool's README**:
+```
+mdmcleaner.py makedb -o MDMCLEANER_DB_FOLDER
+```
+Don't forget to add the database folder to the configuration file:
+```
+mdmcleaner.py set_configs --db_basedir PATH_TO_MDMCLEANER_DB_FOLDER
+```
+### Usage
+The simplest use case of MDMcleaner is as follows:
+```
+mdmcleaner clean -i BIN_REFINEMENT/metawrap_50_10_bins/* -o BIN_REFINEMENT_MDMCLEANER
+```
 ## Step 6: Visualize the community and the extracted bins with the Blobology module

 Lets use the Blobology module to project the entire assembly onto a GC vs Abundance plane, and annote them with taxonomy and bin information. This will not only give us an idea of what these microbial communities are structured like, but will also show us our binning success in a more visual way.
@@ -286,6 +312,12 @@ Let us run the Reassemble_bins module with all the reads we have:
 metawrap reassemble_bins -o BIN_REASSEMBLY -1 CLEAN_READS/ALL_READS_1.fastq -2 CLEAN_READS/ALL_READS_2.fastq -t 96 -m 800 -c 50 -x 10 -b BIN_REFINEMENT/metawrap_50_10_bins
 ```

+If you have nanopore reads, you can supply them with the `--nanopore` flag. If the bins were produced by MDMcleaner, you should be able to point `-b` at the MDMcleaner output folder and add the `--mdmcleaner` flag as well.
+ +``` +metawrap reassemble_bins -o BIN_REASSEMBLY -1 CLEAN_READS/ALL_READS_1.fastq -2 CLEAN_READS/ALL_READS_2.fastq --nanopore CLEAN_READS/NANOPORE_READS.fastq -t 96 -m 800 -c 50 -x 10 -b BIN_REFINEMENT_MDMCLEANER --mdmcleaner +``` + Looking at the output in `BIN_REASSEMBLY/reassembled_bins.stats`, we can see that 3 bins were improved though strict reassembly, 6 improved thorugh permissive reassembly, and 4 bins could not be improved (`.strict`, `.permissive`, and `.orig` bin extensions, respectively): ``` bin completeness contamination GC lineage N50 size binner diff --git a/conda_pkg/meta.yaml b/conda_pkg/meta.yaml index d2efcec..82717c3 100644 --- a/conda_pkg/meta.yaml +++ b/conda_pkg/meta.yaml @@ -40,7 +40,7 @@ requirements: - spades 3.13.0 - taxator-tk 1.3.3e - trim-galore 0.5.0 - + - minimap2 2.24 about: home: https://github.com/ursky/metaWRAP license: MIT diff --git a/installation/dependancies.md b/installation/dependancies.md index 1848269..34b6690 100644 --- a/installation/dependancies.md +++ b/installation/dependancies.md @@ -24,7 +24,7 @@ - salmon - taxator-tk - prokka - +- minimap2 2.24 ## More detailed dependancies: @@ -55,6 +55,9 @@ - checkm_DB (standard) - SPAdes v3.10.1 +#### bin_reassembly +- minimap2 2.24 + ### quant_bins - salmon - seaborn 0.8.1