Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[v2.10.1]

Fixed

  • Fix the json.decoder.JSONDecodeError when running with --amr.

[v2.10.0]

Changed

  • The workflow now uses the fastcat read length and quality histograms instead of the per-read stats in the report process.

Added

  • Output IGV configuration file if the keep_bam option is enabled and a custom reference is provided (in minimap2 mode).
  • Output reduced reference file if the keep_bam option is enabled (in minimap2 mode).
  • abundance_threshold reduces the number of references to be displayed in IGV.

Fixed

  • Complete previous taxonomic rank when there are no parent nodes.
  • Checking the correspondence between the reference and ref2taxid now also works with compressed references.
  • exclude-host can now take a file as input in the EPI2ME Desktop Application.

[v2.9.4]

Added

  • Statistics derived from minimap2 alignment are now in the output if the keep_bam option is enabled.
  • Reads below the thresholds for percent identity (min_percent_identity) and reference coverage (min_ref_coverage) are treated as unclassified in the minimap2 approach.
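
As a rough illustration of the threshold rule in the entry above, here is a minimal Python sketch. The function name, the use of percent units, and the choice to treat a read as unclassified when either value falls below its threshold are assumptions made for illustration; they are not taken from the workflow's code.

```python
def is_classified(percent_identity, ref_coverage,
                  min_percent_identity, min_ref_coverage):
    """Return True if an alignment meets both thresholds (all values in percent)."""
    # Assumption: failing either threshold marks the read as unclassified.
    return (percent_identity >= min_percent_identity
            and ref_coverage >= min_ref_coverage)
```

Under these assumptions, a read aligned at 85% identity with min_percent_identity set to 90 would be counted as unclassified regardless of how much of the reference it covers.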

Fixed

  • Request less memory if kraken2_memory_mapping is used.
  • Show the percentage of each species when hovering over the taxonomy bar plot.

Changed

  • bam folder within output has been renamed to bams.

[v2.9.3]

Fixed

  • Files that are empty following the fastcat filtering are discarded from downstream analyses.

[v2.9.2]

Fixed

  • "Can only use .dt accessor with datetimelike values" error in makeReport
  • "invalid literal for int() with base 10" error in makeReport

[v2.9.1]

Fixed

  • Avoid argument list lengths that may be too long when using glob patterns.
  • The Kraken2 pipeline sometimes reporting wrong numbers for unclassified counts.

[v2.9.0]

Fixed

  • Minimum memory requirement for custom databases depends on the database size.

Added

  • Workflow now accepts BAM or FASTQ files as input (using the --bam or --fastq parameters, respectively).
  • Run IDs are now included in the output BAM files.
  • Add nextflow tags to keep track of the samples along the different processes.
  • Output BAM files without host reads.

Changed

  • Default for --n_taxa_barplot increased from 8 to 9.
  • Replace the eCharts barplot with an equivalent Bokeh version.
  • --database_set parameter is now Standard-8 when the --classifier parameter is set to kraken2.

[v2.8.0]

Added

  • Standard 8GB kraken2 database.
  • Update docs.

Fixed

  • In the heatmap generated when --minimap2_by_reference is enabled, references with a mean coverage of less than 1% of the largest value are omitted.
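
The 1% cutoff described in the entry above can be sketched as follows. This is a hypothetical helper, assuming mean coverage is available as a mapping from reference name to a number; the workflow's actual data structures and code may differ.

```python
def filter_low_coverage(mean_coverage, fraction=0.01):
    """Drop references whose mean coverage is below `fraction` of the maximum."""
    if not mean_coverage:
        return {}
    # The cutoff is relative to the best-covered reference, not an absolute value.
    cutoff = fraction * max(mean_coverage.values())
    return {ref: cov for ref, cov in mean_coverage.items() if cov >= cutoff}
```

Because the cutoff is relative, the set of references shown depends on the best-covered reference in the sample.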

[v2.7.0]

Fixed

  • Use store_dir without staging files from the web. Kraken2 can run offline if the databases have been previously stored.
  • Fastcat plots in the report showing stats from before host sequence removal when --exclude_host is used in the minimap2 pipeline.
  • Real time kraken workflow hanging indefinitely when attempting to start kraken server with too many threads.

Added

  • Minimap pipeline is also able to use store_dir to store databases and run offline if the databases have been previously stored.
  • Kraken2 pipeline accepts a sample sheet if the real time option is disabled.
  • Only taxa present in the abundance table above the --abundance_threshold will appear in the alignment summary table (which is only generated when --minimap2_by_reference is enabled).

Removed

  • --bracken_dist: the bracken additional file for the database must be included in the database folder, as it is in the kraken2 indexes and when the database is generated.
  • Default local executor CPU and RAM limits

Changed

  • --watch_path is now called --real_time and enables the kraken2 pipeline to classify reads as they are written with watch_path.
  • The kraken2 workflow can now be used without --real_time; this uses the serverless kraken2 executable.
  • Barcode directories must now be named in the format barcodeNN, where NN is at least two digits (e.g. barcode01).
  • Barcode directories must now have the same number of characters (e.g. barcode01 cannot be provided with barcode001).
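
The two barcode directory rules above can be checked before launching a run. The sketch below is a hypothetical pre-flight check, not part of the workflow; the function name and the wording of the messages are invented for illustration.

```python
import re

# "barcode" followed by at least two digits, e.g. barcode01 or barcode001.
BARCODE_PATTERN = re.compile(r"barcode\d{2,}")


def check_barcode_dirs(names):
    """Return a list of problems found in the given directory names."""
    problems = [f"bad name: {n}" for n in names
                if not BARCODE_PATTERN.fullmatch(n)]
    # All names must have the same number of characters.
    if len({len(n) for n in names}) > 1:
        problems.append("names differ in length")
    return problems
```

For example, passing barcode01 together with barcode001 reports a length mismatch, even though each name is valid on its own.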

[v2.6.1]

Fixed

  • Broken report when the dataframe is filtered using the --abundance_threshold.
  • Taxonomy abundances barplot was not showing more abundant species.

[v2.6.0]

Fixed

  • Broken plots caused by single quotes in NCBI taxon names.

Added

  • Add the abundance_table_rank.tsv in the output for the last analysed rank.
  • Optional --minimap2_by_reference parameter to output the sequencing depth and coverage of each matched reference in the database.

[v2.5.0]

Added

  • --kraken_confidence to specify a threshold score.
  • --exclude_host: Optional parameter can accept a FASTA/MMI file with a host reference to be excluded from the analysis.
  • --include_kraken2_assignments: Output the classification of each read.

[v2.4.4]

Added

  • --abundance_threshold: filter taxa based on their abundances.
  • --n_taxa_barplot: control the number of taxa displayed in the barplot.
  • Plot the taxa abundance distribution (e.g. Species abundance distribution plots).

[v2.4.3]

Changed

  • Remove abricate version if AMR does not run.

Fixed

  • Changelog format.

Added

  • Alpha diversity indices: Berger-Parker dominance index, Fisher’s alpha.
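
For readers unfamiliar with the first of these indices: the Berger-Parker dominance index is the proportion of all counts that belongs to the most abundant taxon. A minimal sketch, assuming per-taxon read counts for a single sample are given as a list (the function name is illustrative and not taken from the workflow):

```python
def berger_parker(counts):
    """Berger-Parker dominance: share of the most abundant taxon in the total."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one non-zero value")
    return max(counts) / total
```

Values close to 1 indicate a sample dominated by a single taxon.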

[v2.4.2]

Changed

  • Bumped minimum required Nextflow version to 23.04.2.

[v2.4.1]

Fixed

  • Kraken2 pipeline: all the samples are shown in the report.

Changed

  • Spaces in sample aliases are now replaced with underscores.

[v2.4.0]

Added

  • Antimicrobial resistance gene identification using Abricate.

[v2.3.0]

Added

  • A new option kraken2_memory_mapping to avoid kraken2 loading the database into process-local RAM.
  • --keep_bam parameter to write BAM files into the output directory (minimap pipeline).
  • Lineages sunburst plot added to the report.
  • SILVA.138 database available for both kraken2 and minimap2 pipelines.

Changed

  • bracken_level parameter has been replaced by taxonomic_rank to choose the taxonomic rank at which to perform the analysis. It works in both pipelines.
  • Updated example command displayed when running --help.
  • Updated GitHub issue templates to force capture of more information.
  • Bumped minimum required Nextflow version to 22.10.8.
  • Enum choices are enumerated in the --help output.
  • Enum choices are enumerated as part of the error message when a user has selected an invalid choice.

Fixed

  • Replaced --threads option in fastqingress with hardcoded values to remove warning about undefined param.threads.
  • Extract reads using --minimap2filter and --minimap2exclude filters. The extracted reads are in the output/filtered folder.

[v2.2.1]

Added

  • A new option --min_read_qual to filter by quality score.
  • Configuration for running demo data in AWS
  • AWS configuration for external kraken2_server for demonstration at LC23

Fixed

  • Fix minimum and maximum length read filter.

Removed

  • Default region and AWS CLI path for AWS batch profile

[v2.2.0]

Changed

  • Updated existing databases.
  • Docker will use an ARM platform image on appropriate devices.

Added

  • A new PFP-8 database option.
  • New test_data with Bacteria, Archaea and Fungi.

Fixed

  • Fix file names when exporting tables.

[v2.1.1]

Fixed

  • Include 'kingdom' for Eukarya.
  • Add ability to use an external kraken2 server.

[v2.1.0]

Updated

  • New fastqingress.
  • New report with ezcharts.

Added

  • Stacked barplot for most abundant taxa.
  • Show rank information in abundance tables.
  • Export function from tables.

Fixed

  • Fix crash in the report with one sequence in the fastq.
  • Use kraken2 with parallelization in single client.

[v2.0.10]

Fixed

  • Remove symbolic links from store_dir.

[v2.0.9]

Changed

  • Update kraken databases to latest and ensure relevant taxdump is used.

Added

  • Plot species richness curves.
  • Provide original and rarefied (i.e. all the samples have the same number of reads) abundance tables listing taxa per sample for a given taxonomic rank.
  • Add diversity indices.
  • Memory requirement help text.

[v2.0.8]

Fixed

  • Example_cmd in config.
  • Minimap2 subworkflow fixed for the case where there are no alignments.
  • Remove quality 10 parameter from Minimap2 subworkflow.

[v2.0.7]

Fixed

  • Issue where processing more than ~28 input files led to excessive memory use.
  • Minor typos in docs.

[v2.0.6]

Changed

  • Updated description in manifest

[v2.0.5]

Fixed

  • The version in the config manifest is up to date.

[v2.0.4]

Fixed

  • Issue where discrepancies between taxonomy and databases led to error.

Added

  • nextflow run epi2me-labs/wf-metagenomics --version will now print the workflow version number and exit.

Changed

  • Parameter name for selecting known database is now --database_set (was --source).
  • Add classifier parameter and only allow running of minimap2 or kraken2 workflow.
  • Workflow logic in kraken workflow has been reorganised for simpler parallelism.

Removed

  • -profile conda is no longer supported; users should use -profile standard (Docker) or -profile singularity instead.
  • --run_indefinitely parameter removed; running indefinitely is now implied when --read_limit is set to null.

[v2.0.3]

Fixed

  • Add a test and fix for the case where all files in one directory are unclassified.
  • Check if fastq input exists

[v2.0.2]

Changed

  • Use store directory for database.
  • Use per file kraken_report instead of cumulative.
  • Kraken2-server v0.0.8.

Added

  • Add a run indefinitely parameter.

Fixed

  • Batch size breaking fastcat step.
  • Consider white space in bracken report.
  • Handling for unclassified with Bracken.

[v2.0.1]

Fixed

  • Handling with kraken2 for single input file

Removed

  • Removed sanitize option

[v2.0.0]

Fixed

  • Output argument in Fastqingress homogenised.

Changed

  • Bumped base container to v0.2.0
  • Kraken workflow now in real time mode with watch_path
  • Kraken and Minimap now in subworkflows
  • Fastqingress metadata map
  • Can only run Kraken or Minimap subworkflow not both
  • Better help text on cli
  • Fastq ingress and Args update
  • Set out_dir option type to ensure output is written to correct directory on Windows

[v1.1.4]

Added

  • pluspf8, ncbi_16s_18s_28s_ITS databases
  • Add all sample tool combinations to report

Changed

  • Enable kraken2 by default
  • Clarify error messages

Fixed

  • Handle no assignments bracken error

[v1.1.3]

Added

  • New docs format.
  • Render bokeh.

[v1.1.2]

Changed

  • Update nextflow_schema.json

[v1.1.1]

Fixed

  • Overriding taxonomy now works correctly
  • Added missing threads param to kraken2

[v1.1.0]

Added

  • Report now includes dynamic sankey visualisation and table
  • Nextflow schema

Changed

  • Updated to use new fastqingress module, permitting single .fastq input
  • Rewired DAG to rely on sample IDs rather than filenames

Fixed

  • Handle bracken failure when there are no classifications
  • Handle cyclic dag issue when taxonomy has duplicate names

[v1.0.0]

  • First release.