- add missing macOS wheels
- add support for Python 3.13
- drop support for Python 3.8
- fix potentially incorrect results of jaro_winkler when using high prefix weights
- improve type hints
- upgrade rapidfuzz-cpp to v3.0.0
- drop support for Python 3.7
- added keyword argument pad to Hamming distance. This controls whether sequences of different length should be padded or lead to a ValueError
- upgrade to Cython==3.0.3
- add support for Python 3.12
- drop support for Python 3.6
- add wheels for windows arm64
- upgrade rapidfuzz-cpp to v2.0.0
- relax dependency requirement on rapidfuzz
- fix function signature of get_requires_for_build_wheel
- type hints for editops/opcodes/matching_blocks did not allow any hashable sequence
- type hints did not get installed
- fix incorrect result normalization in setratio and seqratio
- fix support for cmake versions below 3.17
- fix version requirement for rapidfuzz-cpp when building against a previously installed version
- modernize cmake build to fix most conda-forge builds
- Added support for Python 3.11
- fix matching_blocks conversion for empty editops
- added in-tree build backend to install cmake and ninja only when they are not installed yet and only when wheels are available
- fix broken matching_blocks conversion
- use matching_blocks/apply/remove_subsequence/inverse implementation from RapidFuzz
- stop adding data to wheels
- fix segmentation fault on some invalid editop sequences in subtract_edit
- detect duplicated entries in editops validation
- add musllinux wheels
- add missing type hints
- Add type hints
- implement all Python wrappers mostly with cython
- replace usage of deprecated Python APIs
- fix behavior of median and median_improve
- Allow installation from system installed versions of rapidfuzz-cpp
- Indel.normalized_similarity was broken in RapidFuzz v2.0.0 (see #20)
- Fixed memory leak in error path of setratio
- Fixed out of bound reads due to uninitialized variable in median
  - e.g. quickmedian(["test", "teste"], [0, 0]) caused out of bound reads
- Use a faster editops implementation provided by RapidFuzz
- Reduce code duplication
- reuse implementations from rapidfuzz-cpp
- Transition to scikit-build
- Removed support for Python 3.5
- Add support for RapidFuzz v1.9.*
- Add support for Python 3.10
- Update SequenceMatcher interface to support the autojunk parameter
- Drop Python 2 support
- Fixed free of non-heap object caused by zero offset on a heap object
- Fixed warnings about missing type conversions
- Fix segmentation fault in subtract_edit when incorrect input types are used
- Fixed unchecked memory allocations
- Implement distance/ratio/hamming/jaro/jaro_winkler using rapidfuzz instead of providing its own implementation
- Implement wrappers for inverse/editops/opcodes/matching_blocks/subtract_edit/apply_edit using Cython to simplify support for new Python versions
- Maintainership passed to Max Bachmann
- use faster bitparallel implementations for distance and ratio
- avoid string copies in distance, ratio and hamming
- Fix usage of deprecated Unicode APIs in distance, ratio and hamming
- Fixed incorrect window size inside Jaro and Jaro-Winkler implementation
- Fixed incorrect exception messages
- Removed unused functions and compiler specific hacks
- Split the Python and C implementations to simplify building of the C library
- Fixed multiple bugs which prevented the use as C library, since some functions only got defined when compiling for Python
- Build and deliver python wheels for the library
- Fixed incorrect allocation size in lev_editops_matching_blocks and lev_opcodes_matching_blocks
- Fixed handling of numerous possible wraparounds in calculating the size of memory allocations; incorrect handling of which could cause denial of service or even possible remote code execution in previous versions of the library.
- Fixed a bug in StringMatcher.StringMatcher.get_matching_blocks / extract_editops for Python 3; now allow only str editops on both Python 2 and Python 3, for simpler and working code.
- Added documentation in the source distribution and in Git
- Fixed the package layout: renamed the .so/.dll to _levenshtein, and made it reside inside a package, along with the StringMatcher class.
- Fixed spelling errors.
- Fixed a bug in setup.py: installation would fail on Python 3 if the locale did not specify UTF-8 charset (Felix Yan).
- Added COPYING, StringMatcher.py, gendoc.sh and NEWS in MANIFEST.in, as they were missing from source distributions.
- Added Levenshtein.h to MANIFEST.in
- Python 3 support, maintainership passed to Antti Haapala
- Made python-Levenshtein Git compatible and use setuptools for PyPI upload
- Created HISTORY.txt and made README reST compatible
- apply_edit() broken for Unicodes was fixed (thanks to Radovan Garabik)
- subtract_edit() function was added
- Hamming distance, Jaro similarity metric and Jaro-Winkler similarity metric were added
- ValueErrors raised on wrong argument types were fixed to TypeErrors
- a poor-but-fast generalized median method quickmedian() was added
- some auxiliary functions added to the C api (lev_set_median_index, lev_editops_normalize, ...)
- fixed missing `static' in the method list
- some compilation problems with non-gcc were fixed
v0.8.0
- median_improve(), a generalized median improving function, was added
- an arbitrary length limitation imposed on greedy median() result was removed
- out of memory should be handled more gracefully (on systems w/o memory overcommitting)
- the documentation now passes doctest
- fixed greedy median() for Unicode characters > U+FFFF, it's now usable with whatever integer type wchar_t happens to be
- added missing MANIFEST
- renamed exported C functions, all public names now have lev_, LEV_ or Lev prefix; defined lev_byte, lev_wchar, and otherwise sanitized the (still unstable) C interface
- added edit-ops group of functions, with two interfaces: native, useful for string averaging, and difflib-like for interoperability
- added an example SequenceMatcher-like class StringMatcher
- a segfault in seqratio()/setratio() on invalid input has been fixed to an exception
- optimized ratio() and distance() (about 20%)
- Levenshtein.h header file was added to make it easier to actually use it as a C library
- a segfault in setratio() was fixed
- median() handles all empty strings situation more gracefully
- new functions seqratio() and setratio() computing similarity between string sequences and sets
- Levenshtein optimizations (affects all routines except median())
- all Sequence objects are accepted, not just Lists
- setmedian() finding set median was added
- median() initial overhead for Unicodes was reduced
- ratio() and distance() now accept both Strings and Unicodes
- removed uratio() and udistance()
- Levenshtein.c is now compilable as a C library (with -DNO_PYTHON)
- a median() function finding approximate weighted median of a string set was added
- Initial release