Changelog

Hannes Hauswedell edited this page Aug 26, 2024 · 52 revisions

Version 3.1.0 [2024/08/26]

This includes the changes used in https://doi.org/10.1093/bioinformatics/btae097

Features

  • Two new profiles, pairs-default and pairs-sensitive, that are useful in combination with higher --num-matches values. 6cd1d33
  • Improvements to the core algorithm. 40b49a0

Notable bug fixes

  • In edge cases, the alignment coordinates may have been calculated wrongly. f1ebc9a
  • Proper exception handling. aad68eb
  • Accept .fna and .faa as extensions for FASTA files. 05262cd

Compatibility

  • The command line interface is compatible with 3.0.0.
  • The on-disk format is compatible with 3.0.0.
  • The output generated for most input files has changed slightly since 3.0.0.

Dependencies

  • same as for 3.0.0

Version 3.0.0 [2023/07/25]

Features

  • New program mode for searching bisulphite data.
  • The nucleotide mode has received much more testing and optimisation.
  • Huge overhaul of the algorithm; Lambda3 is up to 3x faster than Lambda2 and uses less memory.
  • Use --profile fast or --profile sensitive to select fine-tuned parameter combinations that are faster or more sensitive than the default.

Compatibility

  • The command line interface is very similar to Lambda2's, but some options have been added and some removed.
  • The on-disk format of the index has changed. You need to recreate your index files or download new ones from the wiki. Indexes are now single files and may be used in compressed state.
  • GCC-11 or later and -std=c++20 are required to build.
  • Requires a 64-bit Intel or AMD CPU with SSE4 and POPCNT instructions.

Dependencies

Version 2.0.1 [2023/07/18]

features

  • use --bit-score in addition to or instead of --e-value

bug fixes

  • fix 32bit builds
  • bug in BLASTN evalue calculation (slightly changed values)
  • various typos and documentation fixes
  • fix dispatcher script on macOS

compatibility

  • the command line interface is identical to lambda >= 1.9.4
  • the on-disk index format is compatible with lambda >= 1.9.3
  • requires seqan >= 2.3.1; binary packages based on seqan-2.4.0
  • compatible with C++17 if used with seqan >= 2.4.0

Version 2.0.0 [2019/01/11] stable

This is the 2.0 stable release of lambda. It is identical to 1.9.5. The on-disk format is guaranteed to be preserved, and all command line options and internal parameters will remain fixed unless there are bugs.

Version 1.9.5 [2018/05/30] experimental

bug fixes

  • BLASTN was broken (#115)
  • wrong escaping of " and ' in command line arguments (#116)
  • mixed lower-case / upper-case letters led to error on indexing (#117)
  • divide-by-zero with very small databases (#118)
  • lca-computation error with certain sequences that have no taxonomy information (#119)

compatibility

  • the command line interface is identical to lambda >= 1.9.4
  • the on-disk index format is compatible with lambda >= 1.9.3
  • requires seqan >= 2.3.1; binary packages based on seqan-2.4.0
  • compatible with C++17 if used with seqan >= 2.4.0

Version 1.0.3 [2018/02/17] stable-branch

bug-fixes and minor changes

  • fix build with seqan-2.4.0 (#114)

compatibility

  • both the interface and the generated index files are fully compatible with the 0.9.* and 1.0.* series
  • requires seqan-2.2.0 or later; packages built from seqan-2.4.0

Version 1.9.4 [2018/02/05] experimental

features and usability improvements

  • all new single-executable interface with sub-commands (like git); see the LAMBDA → lambda2-guide (#88, #94)
    • the executable is now called lambda2
    • the subcommands mkindexp, mkindexn, searchp and searchn are currently supported
    • the -p/--program parameter has disappeared, instead one chooses via the command between nucleotide and protein search (#96)
    • many options are now auto-detected from the files, including all index-options and the source alphabets (DNA vs AminoAcid) (#6, #60)
    • all short options are now single-letter, some short-option specifiers were removed (#108)
  • man-pages are now automatically generated and included in the packages
  • generic and optimised binaries are now shipped within the same package and automatically selected
  • index generation is now 30% – 50% faster (#112)
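
The sub-command scheme described above can be illustrated with a small wrapper. This is only a sketch: the executable name lambda2 and the four sub-command names come from the list above, while the helper name lambda2_argv, the action/alphabet labels, and the reading of the p and n suffixes as protein and nucleotide are assumptions made for illustration. Any further arguments are passed through untouched, since this changelog does not list them.

```python
# Sub-command names as listed in this changelog entry; the (action, alphabet)
# keys are illustrative labels, not part of the lambda2 interface.
SUBCOMMANDS = {
    ("mkindex", "protein"): "mkindexp",
    ("mkindex", "nucleotide"): "mkindexn",
    ("search", "protein"): "searchp",
    ("search", "nucleotide"): "searchn",
}


def lambda2_argv(action, alphabet, extra_args=()):
    """Build an argv list for lambda2 (hypothetical helper).

    The sub-command replaces the old -p/--program parameter: the choice
    between nucleotide and protein search is made via the command name.
    """
    try:
        subcommand = SUBCOMMANDS[(action, alphabet)]
    except KeyError:
        raise ValueError(f"unsupported combination: {action!r}, {alphabet!r}") from None
    return ["lambda2", subcommand, *extra_args]
```

The returned list can be handed to subprocess.run; consult the lambda2 guide for the options each sub-command actually accepts.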

bug fixes

  • crash on empty query sequences, fix requires seqan >= 2.4.0 (#111)
  • crash if the output file is placed in a non-existent or non-writable directory (#113)

compatibility

  • the command line interface has changed considerably, please see the LAMBDA → lambda2-guide
  • the on-disk index format is compatible with 1.9.3
  • requires seqan >= 2.3.1; binary packages based on seqan-2.4.0
  • compatible with C++17 if used with seqan >= 2.4.0

Version 1.9.3 [2017/06/29] experimental

  • everything from 1.0.2, including checks for updates

features and usability improvements

  • faster searches through
    • two-step extensions for short reads (#36)
    • use of SIMD for short reads (#89)
  • smaller indexes again, approx. the size before 1.9.2 (#93)
  • species annotation (#76):
    • support for extracting RefSeq and UniParc accession IDs
    • support for UniProt .dat mapping files in addition to NCBI's .accession2taxid
  • the expected memory usage is pre-calculated and checked against available memory (#86)

bug-fixes and minor changes

  • fixed subject position error in some matches in 1.9.0 - 1.9.2 (#87)
  • stricter parameter checking that prevents wrong usage with obscure errors (#92)
  • better exception handling (#100)
  • some fixes to BLASTN and TBLASTX modes (#95, #102, #104)
  • rare crash on FreeBSD
  • stack overflow on long exact matches (especially BLASTP, BLASTN)

compatibility

  • the on-disk index format has changed and is not compatible with any previous version
  • requires seqan >= 2.3.1

Version 1.0.2 [2017/06/29] stable

features

  • lambda now informs you if updates are available, see Privacy for more details (#97)

bug-fixes and minor changes

  • various fixes from new SeqAn versions

compatibility

  • both the interface and the generated index files are fully compatible with the 0.9.* and 1.0.* series
  • requires seqan-2.2.0

Version 1.9.2 [2017/01/10] experimental

  • everything from 1.0.1

features

  • SIMD parallelization for short query sequences, e.g. Illumina; not yet default (#58)
  • species annotation of subject sequences, see Taxonomic Workflows (#63)
  • compile time option that enables larger protein database sequences (#70)
  • bi-directional indexes supported; not yet default (#74)
  • taxonomic binning, see Taxonomic Workflows (#77)

bug-fixes and minor changes

  • crash when encountering input sequences that are shorter than a seed (#66)

compatibility

  • the on-disk index format has changed and is not compatible with any previous version
  • some of the optional --sam-bam-tags were renamed, see the wiki (#79)
  • requires seqan-2.3.1

Version 1.0.1 [2017/01/09] stable

features

  • can use .sam and .bam also as sequence input (via seqan-2.3)

bug-fixes and minor changes

  • minor spelling and documentation fixes (#65)
  • make Lambda build on 32bit platforms (#68)
  • make Release the default CMAKE_BUILD_TYPE again -- If you built Lambda yourself and you didn't set this, please rebuild (#71)

infrastructure

  • Lambda is now available in Debian as lambda-align
  • it is also built on many non-x86 platforms, including PowerPC and Sparc64

compatibility

  • both the interface and the generated index files are fully compatible with the 0.9.* and 1.0.* series
  • requires seqan-2.2.0

Version 1.9.1 [2016/08/19] experimental

bug-fixes and minor changes

  • fix SAM and BAM output (#61)

Version 1.9.0 [2016/08/18] experimental

major changes

  • new variable length seeding and new seeding strategy, much faster (#17)
  • new FM index with EPR dictionaries (faster, but bigger) (#57)
  • early support for SIMD operations in extension phase (only faster for small reads) (#58)
  • new database/index format, files now moved to sub-directory, better diagnostics (#7)

compatibility

  • support for .seg files and masking was removed; it yielded poor results and is superseded by variable length seeding (#47)
  • the new index format is incompatible with previous releases and it now uses the -i parameter (#7)
  • both the command line options and the index format are subject to change during the 1.9.* cycle!

Version 1.0.0 [2016/08/18] stable

bug-fixes and minor changes

  • wrong handling of empty databases or databases with empty sequences (#54)

infrastructure

  • removed the git-subtree and retroactively replaced it with git submodule (#55)
    • much smaller repository to clone, if you still want SeqAn with Lambda, add --recursive
    • the 1.0.* series now depends on SeqAn-2.2.0 and not any longer on a development version
    • previous git clones have been invalidated and must be force-pulled or newly cloned
  • significant decrease in binary size (6.3MB vs 30MB) and compile time (1.5m vs 10m) (#49)
  • improved continuous integration (#52)

compatibility

  • both the interface and the generated index files are fully compatible with the 0.9.* series

Version 0.9.4 [2016/04/08]

new features

  • support for soft-clipping in SAM/BAM IO via --sam-bam-clip (#43)

bug-fixes and minor changes

  • missing or redundant hard-clip indicators in SAM/BAM cigar strings (#51)

infrastructure

  • LLVM/Clang compiler >= 3.8.0 now supported (#27)
  • reduced build times on GCC via parallelization (#45)
  • continuous integration for OS X (#46)
  • OpenBSD is now supported as platform (although slower than other Unixes) (#48)
  • Intel Compiler >= 16.0.2 now supported (#50)

Version 0.9.3 [2015/12/11]

infrastructure

  • added app tests (#14)
  • added continuous integration via travis (#39)

bug-fixes and minor changes

  • using indexes that are read-only now works (#38)
  • fixed a crash in TBlastN-mode (#40)
  • fixed a crash in BlastN-mode in combination with suffix array index type (#41)

Version 0.9.2 [2015/11/27]

new features of lambda

new features of lambda_indexer

  • truncate subject IDs by default to save lots of space (can be deactivated) (#37)

bug-fixes and minor changes

  • using SEG files works again (#33)
  • compiler specific parts linked statically on Mac OS X (#34)
  • BLASTN indexes were incorrectly classified as old format (#35)
  • if an outdated index is detected, lambda returns the value 200, so scripts can automatically recreate indexes if using them fails (#35)
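
A script could react to the return value 200 roughly as follows. This is a minimal sketch rather than an official workflow: the function name search_with_reindex is hypothetical, and both command lines are supplied by the caller as argv lists, because the exact lambda and lambda_indexer invocations are not given here.

```python
import subprocess
import sys

OUTDATED_INDEX = 200  # return value named in the changelog entry above


def search_with_reindex(search_cmd, reindex_cmd):
    """Run search_cmd; if it reports an outdated index, rebuild and retry once.

    Both arguments are argv lists chosen by the caller (hypothetical here).
    Returns the final return code of the search command.
    """
    result = subprocess.run(search_cmd)
    if result.returncode != OUTDATED_INDEX:
        return result.returncode
    # The index is outdated: recreate it, then try the search one more time.
    subprocess.run(reindex_cmd, check=True)
    return subprocess.run(search_cmd).returncode


if __name__ == "__main__":
    # Placeholder usage: first argument is the search command, second the
    # indexing command, each given as a single whitespace-separated string.
    if len(sys.argv) == 3:
        sys.exit(search_with_reindex(sys.argv[1].split(), sys.argv[2].split()))
```

Retrying only once avoids an endless loop if re-indexing does not resolve the problem; any other non-zero return code is passed through unchanged.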

Version 0.9.1 [2015/10/23]

new features of lambda

  • mmapped IO for the database enabling faster startup and memory sharing between instances (#3)

new features of lambda_indexer

  • radixsort suffix array creation (originally by @meiers) resulting in over 30% less RAM and up to 30% speed-up (#11)

changes in command line interface of lambda_indexer

  • all previous sorting-based algorithms are superseded by radixsort; please remove e.g. -a quicksort from your scripts

bug-fixes and minor changes

  • build in release mode by default (#29)
  • improved progress reporting during indexing (#31)
  • detect most cases where the index is incompatible (#32)

Version 0.9.0 [2015/09/14]

new features

  • ported to SeqAn 2.0 bringing in lots of smaller changes (#1)
  • support for column reordering and more columns in tabular output (#2)
  • gzipped and bzipped input and output files supported (#19)
  • some checks are performed on input data to detect wrong alphabets (#21)

compatibility

  • previously generated indexes are unfortunately not compatible with 0.9.*

changes in command line interface

  • hide most options by default (visible again with --full-help)

bug-fixes and minor changes

  • error in man-page (#5)
  • erroneously report "unexpected extension failure" (#9)
  • crash in TBlastX mode (#10)
  • parameter for number of matches not working (#13)
  • crash when putative duplicates heuristic is turned off (#22)
  • many small improvements to BlastIO

Version 0.4.7 [2014/12/12]

bug-fixes and minor changes

  • fix build on Darwin / MacOS X

availability

Version 0.4.5 [2014/12/05]

bug-fixes and minor changes

  • lambda_indexer now has a different suffix array construction algorithm
  • this works on larger files (where the old algorithm sometimes failed) and is fully parallelized
  • there is also a rough progress indication when indexing

availability

Version 0.4.1 [2014/11/10]

bug-fixes and minor changes

  • default index type was not set to FM in lambda_indexer

availability

Version 0.4.0 [2014/11/07]

performance changes in comparison to published version (0.2)

  • new default mode with 30-80% speed gains and up to 75% memory reduction over published version
  • double-indexing mode with speed gains > 100%
  • sensitivity slightly increased at the same time (1-2%)

changes in command line interface

  • renamed many parameters and changed some defaults
  • please look at lambda --help to see all the changes!!
  • better control of verbosity with -v parameter
  • threads now controlled with -t instead of environment variable

new features

  • BlastN mode now usable again and proper parameter-handling added for it
  • added percent identity cutoff in addition to e-value cutoff (-id)
  • added a limit for maximum number of matches per query sequence (-nm)
  • added abundance heuristic (-pa) and prioritization of hits, so that not all hits are examined if the number of hits >> chosen limit
  • single-indexing mode which has huge memory advantages (-qi none) [now default]
  • FM-Index is now also default
  • removed Lambda-Alphabets, since they currently provide little benefit over Murphy10

bug-fixes and minor changes

  • indexes with different settings (index type, alphabet) can now be created on the same FASTA file without conflicts between them
  • changed pre-scoring heuristic to include region around match (-ps and -pt)
  • fixed build issues with gcc-4.8.x
  • FastQ support fixed

availability

Version 0.3 [2014/09/06]

performance changes in comparison to published version (0.2)

  • Speed increased by ~20%
  • Suffix-Array index memory consumption reduced from 16x to 6x input database size
  • experimental support for FM-index as index (instead of SA) [not yet widely tested]
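
To put the suffix array figures above into perspective, the arithmetic can be sketched as below. The function names are hypothetical and the estimate simply multiplies the database size by the stated factor; it is a rough illustration, not a model of Lambda's actual memory accounting.

```python
def sa_index_bytes(db_bytes, factor):
    """Estimated suffix array index memory as a multiple of the database size."""
    return db_bytes * factor


def reduction(old_factor, new_factor):
    """Fraction of memory saved when moving from old_factor to new_factor."""
    return 1 - new_factor / old_factor


if __name__ == "__main__":
    db = 2 * 1024**3  # an example 2 GiB input database (assumed size)
    print(sa_index_bytes(db, 16) / 1024**3, "GiB before")
    print(sa_index_bytes(db, 6) / 1024**3, "GiB after")
    print(f"{reduction(16, 6):.1%} less memory")
```

Going from 16x to 6x corresponds to a reduction of 62.5% in index memory, independent of the database size.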

bug-fixes and minor changes

  • small bugs in BLAST output formats corrected

availability

Version 0.2 [2014/04/07] published version

  • multiple optimizations
  • added option to partition the query sequences
  • added overlapping seeds capability

availability

Version 0.1 [2014/01/15] initial release
