27 Aug 14:09

gakhov

e72a20e

Implement frequency algorithms

Pre-release

NEW

Count Sketch algorithm implementation
Count-Min Sketch algorithm implementation

Count Sketch and Count–Min Sketch are simple space-efficient probabilistic data structures
that are used to estimate frequencies of elements in data streams and can address the Heavy hitters problem.
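To make the idea concrete, here is a minimal pure-Python sketch of a Count-Min Sketch. It is an illustration only, not the code shipped in this release: the class name, the `add`/`frequency` method names, and the use of salted `hashlib.blake2b` as the per-row hash family are all assumptions made for this example.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min Sketch (not the pdsa implementation)."""

    def __init__(self, num_rows, num_cols):
        self.num_rows = num_rows
        self.num_cols = num_cols
        # One row of counters per hash function.
        self.table = [[0] * num_cols for _ in range(num_rows)]

    def _index(self, row, item):
        # Assumption: derive one hash function per row by salting with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.num_cols

    def add(self, item):
        # Increment exactly one counter in every row.
        for row in range(self.num_rows):
            self.table[row][self._index(row, item)] += 1

    def frequency(self, item):
        # Collisions can only inflate a counter, so the minimum over
        # the rows is the tightest available upper bound on the true count.
        return min(
            self.table[row][self._index(row, item)]
            for row in range(self.num_rows)
        )
```

Because counters are only ever incremented, `frequency` may overestimate an element's count but never underestimates it. Count Sketch differs in that each update is multiplied by a random sign and the estimate is the median over the rows rather than the minimum.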

Count Sketch was proposed by Moses Charikar, Kevin Chen, and Martin Farach-Colton in 2002.
Count–Min Sketch was presented in 2003 by Graham Cormode and Shan Muthukrishnan and published in 2005.

12 Jul 22:05

gakhov

0.4.1

528cfa6

Maintenance release

Pre-release

FIXES

Fix README to indicate that the minimal required Cython version is 0.28 (Thank you @jseabold)
Explicitly require Cython 0.28+ and Python 3.5+ in setup.py
Fix cardinality tests (or at least come one step closer to the correct test approach without actual big data)

09 May 09:50

gakhov

0.4.0

f291f34

Implemented HyperLogLog algorithm

Pre-release

NEW

HyperLogLog algorithm implementation

The HyperLogLog algorithm was proposed by Philippe Flajolet, Éric Fusy, Olivier Gandouet, and Frédéric Meunier in 2007. There are a number of modifications of the algorithm, but this implementation uses the original version with a 32-bit hash function.
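As a rough illustration of the original algorithm with a 32-bit hash, the following pure-Python sketch keeps `2**precision` registers and applies the small-range and large-range corrections from the 2007 paper. It is not the code in this release: the class and method names are hypothetical, and truncating SHA-256 to 32 bits is an assumption standing in for whatever hash function the library actually uses.

```python
import hashlib
import math

TWO_32 = 1 << 32


def _alpha(m):
    """Bias-correction constant from the original HyperLogLog paper."""
    if m == 16:
        return 0.673
    if m == 32:
        return 0.697
    if m == 64:
        return 0.709
    return 0.7213 / (1 + 1.079 / m)


class HyperLogLog:
    """Illustrative HyperLogLog with a 32-bit hash (not the pdsa implementation)."""

    def __init__(self, precision):
        if not 4 <= precision <= 16:
            raise ValueError("precision must be between 4 and 16")
        self.p = precision
        self.m = 1 << precision
        self.registers = [0] * self.m

    @staticmethod
    def _hash32(item):
        # Assumption: the first 4 bytes of SHA-256 serve as the 32-bit hash.
        return int.from_bytes(hashlib.sha256(item.encode("utf-8")).digest()[:4], "big")

    def add(self, item):
        x = self._hash32(item)
        width = 32 - self.p
        index = x >> width                # first p bits select a register
        rest = x & ((1 << width) - 1)     # remaining bits determine the rank
        # Rank: position of the leftmost 1-bit in `rest` (width + 1 if all zero).
        rank = width - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        m = self.m
        raw = _alpha(m) * m * m / sum(2.0 ** -r for r in self.registers)
        if raw <= 2.5 * m:
            # Small-range correction: fall back to linear counting.
            zeros = self.registers.count(0)
            if zeros:
                return m * math.log(m / zeros)
            return raw
        if raw > TWO_32 / 30:
            # Large-range correction for 32-bit hash collisions.
            return -TWO_32 * math.log(1 - raw / TWO_32)
        return raw
```

With `m` registers the typical relative error of the estimate is about `1.04 / sqrt(m)`, so `precision=10` gives roughly 3%.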

FIXES

Fix cardinality estimation in Bloom filters
Correct the inline example for q-digest
Remove support for Python < 3.5

03 Nov 14:38

gakhov

0.3.0

16d94a3

Implemented algorithms for cardinality and rank estimation

Pre-release

This pre-release implements the following algorithms:

Cardinality problem

Linear counter
Probabilistic counter (Flajolet–Martin algorithm)

Rank problem

Quantile Digest (q-digest)

Additionally, the overall code has been improved, and tests and usage examples for the implemented data structures have been added.
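For readers unfamiliar with the linear counter listed above, the idea fits in a few lines: hash every element to one position of a bit array and estimate the cardinality from the fraction of positions that remain empty. The sketch below is an illustration under stated assumptions, not the code from this pre-release; the class name, the method names, and the use of SHA-256 as the hash function are hypothetical.

```python
import hashlib
import math


class LinearCounter:
    """Illustrative linear counter (not the pdsa implementation)."""

    def __init__(self, num_bits):
        self.num_bits = num_bits
        # One byte per position keeps the example simple; a real
        # implementation would pack these into a bit vector.
        self.bits = bytearray(num_bits)

    def add(self, item):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        position = int.from_bytes(digest[:8], "big") % self.num_bits
        self.bits[position] = 1

    def count(self):
        zeros = self.num_bits - sum(self.bits)
        if zeros == 0:
            raise ValueError("counter is saturated; use more bits")
        # Estimate n = -m * ln(V), where V is the fraction of empty positions.
        return -self.num_bits * math.log(zeros / self.num_bits)
```

Adding the same element twice sets the same position, so duplicates do not change the estimate. Accuracy degrades as the array fills up, which is why the number of bits has to be chosen relative to the expected cardinality.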

15 Aug 16:10

gakhov

0.2.0

b6417ab

Implement classical data structures for Membership queries

Pre-release

This release implements the first two classical data structures:

Classical Bloom Filter
Counting Bloom Filter

To support a memory-efficient representation, we also developed BitCounter and BitVector.
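The following pure-Python sketch shows how a classical Bloom filter can sit on top of a packed bit vector. It is only an illustration: the class and method names are hypothetical, the `bytearray` with manual bit masking merely stands in for the role a structure like BitVector plays, and the salted `hashlib.blake2b` hash family is an assumption made for the example.

```python
import hashlib


class BloomFilter:
    """Illustrative classical Bloom filter (not the pdsa implementation)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # Packed bit vector: 8 bits per byte, rounded up.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            # Assumption: one hash function per seed, obtained by salting.
            digest = hashlib.blake2b(
                data, digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def test(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(item)
        )
```

A classical Bloom filter cannot delete elements, because clearing a bit might affect other elements that share it. A counting Bloom filter replaces each bit with a small counter, which is where a structure like BitCounter becomes useful.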

Releases: gakhov/pdsa
