igzip wrapper

igzip currently (as of Oct 27, 2021) provides the fastest zlib/gzip-compatible compression and decompression on x86 CPUs. It is a component of the Intel(R) Intelligent Storage Acceleration Library (ISA-L), heavily optimized with low-level techniques including hand-written assembly and AVX-512 instructions to extract the best performance, especially on Intel(R) x86 platforms. In brief:

  • Supports the RFC 1951 DEFLATE standard, like canonical zlib.
  • Offers 4 compression levels that trade compression ratio against speed.
  • Supports multi-threaded compression with up to 8 threads.
  • Optimized with low-level instructions for performance-critical scenarios.

For more details on the zlib solutions in ISA-L, see: Zlib Solutions of Intel(R) ISA-L and Intel(R) IPP.

To provide out-of-the-box compression/decompression functions, we propose the igzip wrapper, which supports direct transformation between a C-style string and a gzip file, built on top of the excellent ISA-L.

Supported API

/* igzip inflate wrapper */
int decompress_file(const char *infile_name, unsigned char *output_string, size_t *output_length);

/* igzip deflate wrapper */
int compress_file(unsigned char *input_string, size_t input_length, const char *outfile_name, int compress_level, int thread_num);

The current loosely coupled structure makes it easy to customize and to add new features such as streaming inflate or deflate. Feel free to copy and paste it, and adapt it to your design!

Build

Prerequisites

  • CMake v3.2 or later
  • ISA-L v2.30.0 or later
  • pthreads

Build shared library

git clone git@github.com:ueqri/igzip-wrapper.git
mkdir -p igzip-wrapper/build
cd igzip-wrapper/build
cmake ..
make -j

Note: multi-threading support for deflate (i.e., compression) is enabled by default. To build a single-threaded version, configure with cmake -DMULTI_THREADED_DEFLATE=OFF .. instead. Inflate supports only a single thread; this restriction comes from the nature of the gzip format.

Link the library with your program

  • Copy igzip_wrapper.h and libigzipwrap.so to your program.
  • Include the wrapper header in the C/C++ source.
  • Link the library when building the program.
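For a CMake-based program, the copy/include/link steps above might look like the following fragment. The target name, the third_party path, and USE_REAL_LIBIGZIPWRAP are placeholders, not part of the wrapper.

```cmake
# Assumes igzip_wrapper.h and libigzipwrap.so were copied into third_party/igzip-wrapper/
find_package(Threads REQUIRED)

add_executable(myapp main.c)
target_include_directories(myapp PRIVATE ${CMAKE_SOURCE_DIR}/third_party/igzip-wrapper)
target_compile_definitions(myapp PRIVATE USE_REAL_LIBIGZIPWRAP)
target_link_libraries(myapp PRIVATE
    ${CMAKE_SOURCE_DIR}/third_party/igzip-wrapper/libigzipwrap.so
    Threads::Threads)  # multi-threaded deflate needs pthreads
```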

Test

To build tests for the inflate and deflate APIs and measure the time cost on your machine, replace the earlier cmake .. with the following command.

  cmake -DBUILD_TEST=ON ..

Usage of the test executables in the build directory:

# Test decompression with check file
./inflate <path-to-source-gzip-file> <path-to-uncompressed-check-file>
# Test compression and use inflate API to check
./deflate <path-to-uncompressed-source> <path-to-output-gzip-file>

Benchmark

A series of comprehensive benchmarks was run by Ruben Vorderman (thanks @rhpvorderman) of the Python community.

Details of the benchmarks

The system was based on Ryzen 5 3600 with 2x16GB DDR4-3200 memory, and running Debian 10.

All benchmarks were performed on a tmpfs which lives in memory to prevent I/O bottlenecks, and using hyperfine for better analysis.

The test file was a 5-million-read FASTQ file of 1.6 GB. Files of this type commonly reach 100+ GB in bioinformatics, so it makes a good real-world benchmark.

pigz was also benchmarked on one thread, since it implements zlib compression in a faster way than gzip. zstd was benchmarked for comparison.

Versions:
  • gzip 1.9 (provided by Debian)
  • pigz 2.4 (provided by Debian)
  • igzip 2.25.0 (provided by Debian)
  • libdeflate-gzip 1.6 (compiled by conda-build with the recipe here: https://github.com/conda-forge/libdeflate-feedstock/pull/4)
  • zstd 1.3.8 (provided by Debian)

Compression: level 1 was chosen for all compression benchmarks. Each time is the average over 10 runs.

COMPRESSION
program            time           size   memory
gzip               23.5 seconds   657M   1.5M
pigz (one thread)  22.2 seconds   658M   2.4M
libdeflate-gzip    10.1 seconds   623M   1.6G (reads the entire file into memory)
igzip              4.6 seconds    620M   3.5M
zstd (to .zst)     6.1 seconds    584M   12.1M

Decompression: all programs decompressed the file created with gzip -1 (even zstd, which can also decompress gzip).

DECOMPRESSION
program            time           memory
gzip               10.5 seconds   744K
pigz (one-thread)  6.7 seconds    1.2M
libdeflate-gzip    3.6 seconds    2.2G (reads into memory before writing)
igzip              3.3 seconds    3.6M
zstd (from .gz)    6.4 seconds    2.2M
zstd (from .zst)   2.3 seconds    3.1M

As these benchmarks show, using Intel's Storage Acceleration Library can improve performance quite substantially, offering very fast compression and decompression. This puts igzip in the zstd ballpark in terms of speed while still offering backward compatibility with gzip.

On my side, after controlling for other variables, I found igzip decompression (8 threads, v2.30.1) to be ~2x faster than zstd (v1.5.0) and ~4x faster than gzip (v1.11). The compression speedup is even greater.

Reference

Zlib Solutions of Intel(R) Intelligent Storage Acceleration Library and Intel(R) Integrated Performance Primitives

Include much faster DEFLATE implementations ISA-L in Python's gzip and zlib libraries

Storage acceleration with ISA-L (From Page 9)

ISA-L Update & Usercases Sharing (From Page 17)

lzbench issue of adding ISA-L to comparison