Releases: sourmash-bio/sourmash

v4.0.0

02 Mar 19:39
4f43288

Major changes for 4.0

4.0 is a major new version of sourmash, and it contains a number of new features and breaking changes.

Please see our migration guide for more information on how to migrate from v3.x to version 4.0!

Numerical output and search results are unchanged

There are no changes to numerical output or search results in this release; you should get the same results with v4 as you get with v3, except where command-line parameters need to be adjusted as noted below (see: protein ksize #1277, lca summarize changes #1175, sourmash gather on signatures without abundance #1328). Please file an issue if your results change!
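
As an illustration of the protein ksize adjustment (#1277), the sketch below converts a v3-style protein ksize into its v4 equivalent by dividing by 3. The helper name `v3_to_v4_protein_ksize` is hypothetical and not part of sourmash, and the check that v3 values are multiples of 3 is an assumption made for this example.

```python
def v3_to_v4_protein_ksize(v3_ksize: int) -> int:
    """Convert a v3-style protein ksize to the v4 convention.

    Hypothetical helper, not part of sourmash. It only applies the
    "divided by 3" rule stated in the release notes.
    """
    # Assumption: a valid v3 protein ksize is a positive multiple of 3.
    if v3_ksize <= 0 or v3_ksize % 3 != 0:
        raise ValueError(f"expected a positive multiple of 3, got {v3_ksize}")
    return v3_ksize // 3


if __name__ == "__main__":
    for k in (21, 30, 57):
        print(f"v3 ksize {k} -> v4 ksize {v3_to_v4_protein_ksize(k)}")
```

Under this rule, a protein ksize of 21 in v3 would correspond to 7 in v4; the migration guide remains the authoritative source for which commands are affected.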

New or changed behavior

  • default SBT storage is now .sbt.zip (#1174, #1170)
  • add sourmash sketch command for creating signatures (#1159)
  • protein ksizes in MinHash are now divided by 3, except in sourmash compute (#1277)
  • refactor MinHash API and implementation: add, iadd, merge, hashes, and max_hash (#1282, #1154, #1139, #1301)
  • add HyperLogLog implementation (#1223)
  • SourmashSignature.name is now a property (not a method): use str(sig) instead of name() (#1179, #1232)
  • lca summarize no longer merges all signatures, and uses hash abundance by default (#1175)
  • index and lca index (#1186, #1222) now support --from-file and no longer require signature files on the command line
  • --traverse-directory is now on by default for signature loading behavior (#1178)
  • sourmash sketch and sourmash compute no longer create empty signatures from empty files and stdin (#1347)
  • sourmash sketch and sourmash compute set sig.filename to empty string when filename is - (#1347)
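
The method-to-property change for SourmashSignature.name (#1179, #1232) can be pictured with a small stand-in class. This is a sketch only: `SignatureStandIn` is a hypothetical class written for this example, not the sourmash implementation, and its `__str__` simply returns the name, which is an assumption about how `str(sig)` behaves.

```python
class SignatureStandIn:
    """Hypothetical stand-in that mimics a `name` property (not sourmash code)."""

    def __init__(self, name: str):
        self._name = name

    @property
    def name(self) -> str:
        # Property access: `sig.name`, with no parentheses.
        return self._name

    def __str__(self) -> str:
        # Assumption for this sketch: str(sig) yields the name.
        return self._name


sig = SignatureStandIn("genome_A")

label = sig.name      # v4-style attribute access
display = str(sig)    # replacement suggested for the old name() call

# v3-style code that still calls name() fails, because the property
# returns a plain string and strings are not callable.
try:
    sig.name()
    old_call_works = True
except TypeError:
    old_call_works = False

print(label, display, old_call_works)
```

Code written against v3 that calls name() would need to switch to one of the two forms above.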

Feature removal

  • remove Python 2.7 support (& end Python 2 compatibility) (#1145, #1144)
  • remove lca gather (#1307)
  • remove 10x support from sourmash compute (#1229)
  • remove 'dump' command (#1157)

Feature/function deprecations

  • deprecate sourmash compute (#1159)
  • deprecate load_signatures, sourmash.load_one_signature, create_sbt_index, and load_sbt_index (#1279, #1304)
  • deprecate import_csv in favor of new sourmash sig import --csv (#1281)

Refactoring, improvements, and minor bug fixes:

  • accept file list in sourmash sig cat (#1236)
  • add unique_intersect_bp and gather_result_rank to gather CSV output (#1219)
  • remove deprecated minhash functions (#1149)
  • fix Rust panic error in signature creation (#1172)
  • cache nodes in SBT during search (#1161)
  • fix two bugs in gather --output-unassigned (#1156)
  • refactor the gather code so that it uses 'hashes' instead of 'mins' (#1329)
  • update output from gather without abundances, so that abund output is empty instead of 0 (#1328)
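
Scripts that parse gather CSV output may need to tolerate the empty abundance fields introduced in #1328, and can pick up the new unique_intersect_bp and gather_result_rank columns from #1219. The sketch below reads a small in-memory CSV. The sample rows are invented for illustration, only a subset of columns is shown, and the `average_abund` column name and the helper names are assumptions of this example.

```python
import csv
import io

# Invented sample data; real gather output contains more columns.
SAMPLE_CSV = """name,average_abund,unique_intersect_bp,gather_result_rank
genome_A,,120000,0
genome_B,,45000,1
"""


def parse_abund(value: str):
    """Return a float, or None when the field is empty (no abundance tracked)."""
    value = value.strip()
    return float(value) if value else None


def load_gather_rows(handle):
    """Read gather-style CSV rows into dictionaries with typed values."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append(
            {
                "name": row["name"],
                "rank": int(row["gather_result_rank"]),
                "unique_intersect_bp": int(row["unique_intersect_bp"]),
                "average_abund": parse_abund(row["average_abund"]),
            }
        )
    return rows


if __name__ == "__main__":
    for entry in load_gather_rows(io.StringIO(SAMPLE_CSV)):
        print(entry)
```

Mapping empty fields to None rather than 0 keeps "no abundance information" distinct from a measured abundance of zero, which is the distinction the new output makes.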

Documentation updates

  • substantial revisions and updates to the documentation (#1283)
  • add information about versioning, migrations, etc to the docs (#1153)

Infrastructure and CI changes:

  • update finch requirement from 0.3.0 to 0.4.1 (#1290)
  • update rand for test, and activate "js" feature for getrandom (#1275)
  • dev updates (configs and doc) (#1298)
  • move wheel building from Travis to GitHub Actions (#1295)
  • fix new clippy warnings from Rust 1.49 (#1267)
  • use tox for running tests locally (#696)
  • CI: small build fixes (#1252)
  • CI: Fix releases in GitHub Actions (#1250)
  • update build_wheel action paths
  • CI: moving python tests from travis to GH actions (#1249)
  • CI: move wheel building to GitHub actions (#1244)
  • remove last .rst file from docs (#1185)
  • update CI for latest branch name change (#1150)

v3.5.1

16 Feb 00:42
@ctb ctb
3bfd0fa

Feature deprecations

  • add deprecation warning for sourmash compute --input-is-10x (#1326)
  • add warnings about new sourmash lca summarize behavior (#1326)
  • add warning for new behavior of MinHash.merge(...) (#1326)
  • add deprecation warning for TarStorage (#1165)

Infrastructure and CI changes:

  • Backport github actions to stable branch (3.5.x) (#1317)

v3.5.0

11 Aug 19:27
@ctb ctb
111b46e

This is the first of several minor releases (v3.5.x) from the new stable branch. These releases focus on preparing for sourmash v4.0 by introducing deprecations and warnings for features that will be removed in v4.0.

Refactoring and deprecations:

  • MinHash class refactoring (#1128, #1129); many deprecations for 4.0 and 5.0
  • sourmash dump deprecated, for removal in 4.0 (#1147)
  • import sourmash_lib deprecated, for removal in 4.0 (#1143)

Cleanup:

  • remove mentions of ijson and khmer (no longer needed dependencies) (#1140)

Documentation:

  • Simplify and clean up README (#1124)
  • Add sourmash logo to docs and README (#1127)
  • update release process and release notes (#1125)

Rust:

  • Update typed-builder requirement from 0.6.0 to 0.7.0 (#1121)

v3.4.1

23 Jul 00:02
@ctb ctb
f8d0262

Major new features:

  • Document sourmash.fig usage and behavior; enable output of compare clustering with labels (#859)
  • Adds --majority option to lca classify using majority vote algorithm (#1113)

Minor improvements:

  • add MinHash compatibility check to sourmash sig intersect (#1116)

Bugs fixed:

  • add ksize selectors back into sourmash sig functions (#1105)

Documentation updates:

  • Minor updates to release procedure (#1102)
  • Update DB links in docs (#1084)

v3.4.0

14 Jul 14:02
a6800f1

Major new features:

  • enable seamless loading of signatures from indexed databases (#1059, #1083, #1090)
  • add signature cat and signature split commands to combine/split signature files (#1044, #1074)
  • add compute-optimized MinHash (for small scaled or large cardinalities) in Rust (#1045)
  • optionally weight lca summarize output by hashval abundance. (#1022)
  • enable moltypes other than DNA in LCA databases (#1013)

Minor improvements:

  • add --num-results/-n to gather (#1047)
  • improve lca index error message when inserting num signature (#1076)
  • autodetect FASTA/FASTQ files if given as signatures (#1078)
  • add is_lineage_match, pop_to_rank, make_lineage to lca_utils (#1081)
  • use stricter niffler versions and add new gz feature to it (#1070)
  • added MinHash.clear() and MinHash.add_hash_with_abundance to Python API (#1046)

Bugs fixed:

  • investigations and fixes around new gather behavior. (#1001)

Refactoring:

  • move tests from test_lca into test_lca_functions (#1035)
  • remove unused run_shell_cmd function (#1032)
  • refactor some tests in test_sourmash.py to use @utils.in_tempdir decorators (#1020)
  • use install scripts from py-ipfs-http-client (#1068)

Documentation:

  • Improve documentation around abundance projection (#1073)
  • Replace recommonmark with myst (docs) (#1021)
  • Fix doctest filename error (#1040)

Thanks to @luizirber @ctb @bluegenes @erikyoung85 for their contributions!

v3.3.1

27 May 02:04
9c94af9

Improvements:

  • Deal with duplicated MD5 in storages (#994)
  • Hide internal representation in core, and update FFI and cbindgen (#986)

Build, CI and docs:

  • upgrade sourmash index usage docs on CLI (#975)
  • Fix two temp files output locations in tests (#989)

v3.3.0

04 May 17:26
@ctb ctb
b17ed13

Improvements:

  • add ZipStorage, support loading SBT databases from storage; .sbt.zip extensions. (#648)
  • Replace khmer.Nodegraph with rust nodegraph; ~5x speedup of SBT search & gather. (#799)

Bugs:

  • Document and (lightly) fix the LCA_Database API. (#966)
  • Fix bug when using Python 3.5 and earlier; refactor LCA_Database tests (#962)

Documentation:

  • Document gather abund tests a bit better; minor refactoring (#886)
  • Improve lca index error (#963)

v3.2.3

21 Apr 13:41
@ctb ctb
a42508f

Incompatibilities with previous versions due to bugs:

  • sourmash gather on SBT databases was setting --threshold-bp=0 in all cases. This was fixed in #942, and output may change. Specify --threshold-bp=0 to recover old behavior.

Improvements:

  • refactor LCA_Database class to support programmatic creation. (#946)
  • add --singleton option to lca summarize (#922)
  • update gather to calculate fraction of match that was in original query (#938)
  • add compare --containment (#937)
  • add --outdir argument to sourmash compute (#935)
  • improvements to sourmash argparse output for compute. (#931)

Bugs:

  • fix lca classify bug with -o (#902)
  • set_abundances now works with large signatures (#911)
  • test & fix LinearIndex, SBT, and LCA gather thresholding. (#942)

Build, CI and docs:

  • create .sonarcloud.properties
  • pin virtualenv version for asv, and also run GH actions on rust version tags (#903)
  • add make clean & rustup update to dev docs (#927)

v3.2.2

09 Feb 05:28
7af043a

Improvements:

  • more refactoring of MinHash API (#889)
  • add_hash_with_abundance method in core library (#892)
  • Replace mins_push and abunds_push with set_abundances (#887)
  • More refactoring of MinHash comparison code (#882)
  • better sourmash compare error handling (#876)

Bugs:

  • fix add_hash with num not setting abundances properly (#891)
  • name signatures based on md5sum, not on name() (#884)

Build, CI and docs:

  • update docs for how to run Rust tests (#888)

v3.2.1

04 Feb 07:42
bb90fc9

Bugs:

  • re-add 'signature' as alias for 'sig' (#881)