cowtowncoder edited this page Apr 9, 2011 · 51 revisions

Overview

This project focuses on building a comprehensive benchmark for comparing the time and space efficiency of open source compression codecs on the JVM platform. Included codecs need to be accessible from Java (and thereby from any JVM language) via either a pure Java interface or JNI, and need to support either basic block mode (byte array in, byte array out) or streaming mode (InputStream in, OutputStream out).
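To make the two modes concrete, here is a minimal sketch using only JDK classes (`Deflater`/`Inflater` for block mode, `GZIPOutputStream`/`GZIPInputStream` for streaming mode). The class and method names (`CodecModes`, `blockCompress`, and so on) are illustrative assumptions, not the benchmark's actual driver API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.Inflater;

// Hypothetical helper class; names are for illustration only.
public class CodecModes {

    // Block mode: byte array in, byte array out.
    static byte[] blockCompress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length / 2 + 16);
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    static byte[] blockUncompress(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream(compressed.length * 2 + 16);
        byte[] buf = new byte[4096];
        while (!inflater.finished()) {
            int n = inflater.inflate(buf);
            // No progress and nothing left to feed: input is truncated or invalid.
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                inflater.end();
                throw new DataFormatException("truncated or invalid input");
            }
            out.write(buf, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }

    // Streaming mode: InputStream in, OutputStream out.
    static void streamCompress(InputStream in, OutputStream out) throws IOException {
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            copy(in, gz);
        }
    }

    static void streamUncompress(InputStream in, OutputStream out) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(in)) {
            copy(gz, out);
        }
    }

    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }
}
```

Block mode needs the whole input in memory up front, while streaming mode processes data in fixed-size chunks, which is why the two are measured as separate tests.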

The benchmark suite is based on the Japex framework.

In addition to the benchmark itself, we provide access to a set of benchmark results, which give an overview of general performance patterns for standard test suites. We recommend running the tests yourself, however, since results vary by platform. To understand more accurately how results apply to your use case(s), the best approach is to collect a set of test data that reflects your usage and run the tests over it.

Codecs included

Currently the following codecs are included in the distribution:

  • LZF (block and streaming modes)
  • QuickLZ (block mode)
  • Gzip: JDK, JCraft (streaming mode)
  • Bzip2 from commons-compression (streaming mode)
  • Snappy (Java JNI wrapper over native Snappy) (block mode; streaming will be added soon)
  • LZMA by 7zip (block mode)
  • Note on LZMA: because of an API mismatch, input and output are fully buffered, so the implementation is a bit sub-optimal. However, since this is a relatively slow algorithm/codec, the effect of buffering should not be drastic.
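The full buffering mentioned in the LZMA note can be sketched as follows. This is an assumption about the general approach, not the benchmark's actual LZMA driver: `StreamToStreamEncoder` is a hypothetical interface standing in for any codec that only copies from an `InputStream` to an `OutputStream`, and JDK gzip is used as a stand-in codec so that the example runs without extra libraries.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical adapter; names are for illustration only.
public class FullBufferingAdapter {

    // Assumed shape of a codec that can only encode stream-to-stream.
    interface StreamToStreamEncoder {
        void encode(InputStream in, OutputStream out) throws IOException;
    }

    // Exposes a block-mode call by buffering all input and all output in memory.
    static byte[] encodeBlock(StreamToStreamEncoder encoder, byte[] input) throws IOException {
        ByteArrayOutputStream buffered = new ByteArrayOutputStream(Math.max(32, input.length / 2));
        encoder.encode(new ByteArrayInputStream(input), buffered);
        return buffered.toByteArray(); // extra copy: part of the buffering overhead
    }

    // Stand-in codec (JDK gzip), used here instead of LZMA.
    static final StreamToStreamEncoder GZIP_STAND_IN = (in, out) -> {
        GZIPOutputStream gz = new GZIPOutputStream(out);
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            gz.write(chunk, 0, n);
        }
        gz.finish(); // write the trailer without closing the caller's stream
    };
}
```

The extra stream wrappers and the final array copy add a fixed cost per call, which matters less the slower the underlying codec is.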

Since there are two basic compression modes (block mode, streaming mode), there are either one or two tests per codec.

In addition to the included codecs, we are aware of other JVM codecs that we cannot yet support (due to API or licensing), as well as codecs for which a JVM-accessible version may be forthcoming. These include:

  • FastLZ: no Java version
  • LZMA by 7-zip: a Java version exists, but its API is not streaming (it only supports reading via InputStream AND writing to OutputStream, which works for encoding files but not for many other use cases)
  • LZO (from http://www.oberhumer.com/opensource/lzo/): only a Java decompressor, no compressor (the test suite needs both, to generate compressed data for decompression)

Getting involved

To access the source, just clone the project: https://github.com/ning/jvm-compressor-benchmark

To participate in discussions of the benchmark suite, results, and other things related to compression performance, please join our discussion group.

Test data sets

Test data used

We have tried to make use of existing de-facto standard test suites, including:

Results

Here are some example results we have collected, to give an idea of what kind of performance to expect. Tests were run single-threaded on a 2.5 GHz Mac mini.

NOTE: although the measurements have "TPS" in their names, the actual unit for the second bar is "MB/sec"; this is an annoying Japex issue. "Size %" is labeled correctly, and indicates the size of the compressed result relative to the original file size.
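As a rough illustration of the two measurements, the arithmetic can be sketched like this. The class and method names are hypothetical, and two details are assumptions rather than facts from the benchmark: that 1 MB means 1024 × 1024 bytes, and that throughput is counted over the bytes passed to the method.

```java
// Hypothetical helper; not part of the benchmark suite or Japex.
public class BenchmarkMetrics {

    // "Size %": compressed size relative to original size, as a percentage.
    static double sizePercent(long compressedBytes, long originalBytes) {
        return 100.0 * compressedBytes / originalBytes;
    }

    // Throughput in MB/sec, assuming 1 MB = 1024 * 1024 bytes.
    static double megabytesPerSecond(long bytesProcessed, long elapsedNanos) {
        double seconds = elapsedNanos / 1e9;
        return (bytesProcessed / (1024.0 * 1024.0)) / seconds;
    }

    public static void main(String[] args) {
        // A 1000-byte file compressed to 250 bytes has Size % = 25.0
        System.out.println(sizePercent(250, 1000));
        // 10 MB processed in 2 seconds is 5.0 MB/sec
        System.out.println(megabytesPerSecond(10L * 1024 * 1024, 2_000_000_000L));
    }
}
```

A lower "Size %" means better compression, while a higher MB/sec means faster processing, so the two bars trade off against each other.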
