This benchmark covers widely distributed libraries and functions that consume a considerable amount of CPU cycles in the datacenter.
WDLBench and its job information are located in separate files. To use it (install, run, list, etc.), please always specify the `-b wdl` option as shown below:
./benchpress_cli.py -b wdl install|run|list|others ...
For example, to install the `folly` single-core job:
./benchpress_cli.py -b wdl install folly_single_core
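Similarly, to list all available WDL jobs (this assumes the `list` subcommand needs no arguments beyond `-b wdl`):
./benchpress_cli.py -b wdl list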
As of 2024 Q3, three libraries are included: `folly`, `lzbench`, and `openssl`; each can run on a single core or on all cores.
For `folly`, the user can choose to run the microbenchmarks all together (i.e., all microbenchmarks in one DCPerf run) or individually (i.e., one microbenchmark per DCPerf run) by selecting different jobs:
./benchpress_cli.py -b wdl run folly_single_core|folly_all_core|folly_multi-thread
./benchpress_cli.py -b wdl run folly_individual -i '{"name": "function_name"}'
For the list of functions that can be run individually, see the list of `folly` benchmarks below.
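For example, to run only the memcpy microbenchmark from that list (this assumes the names in the table below are the accepted values for the `name` parameter):
./benchpress_cli.py -b wdl run folly_individual -i '{"name": "memcpy_benchmark"}'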
For `lzbench` and `openssl`, the user can pass parameters to select how to run them:
./benchpress_cli.py -b wdl run lzbench -i '{"type": "single_core|all_core"}'
./benchpress_cli.py -b wdl run openssl -i '{"type": "single_core|all_core"}'
For `lzbench` and `openssl`, the user can also pass the `algo` parameter to specify the algorithm used. For `lzbench` the default algorithm is `zstd`, while for `openssl` the default algorithm is `ctr` (`aes-256-ctr`).
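For example, the following would run `openssl` on all cores with the `ctr` cipher selected explicitly (this assumes `type` and `algo` can be combined in a single `-i` argument):
./benchpress_cli.py -b wdl run openssl -i '{"type": "all_core", "algo": "ctr"}'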
| Name | Description | Categories |
| --- | --- | --- |
| concurrency_concurrent_hash_map_benchmark | multiple common operations of the folly::ConcurrentHashMap data structure | multi-thread (locks, mutex, etc.) |
| stats_digest_builder_benchmark | append operations to a single DigestBuilder buffer from multiple threads | multi-thread (locks, mutex, etc.) |
| event_base_benchmark | tests on and off speed of the EventBase class, a wrapper of all async I/O processing functionalities | single_core |
| fibers_fibers_benchmark | multiple common operations of FiberManager, which allows semi-parallel task execution on the same thread | single_core |
| function_benchmark | evaluates function call performance | single_core |
| hash_hash_benchmark | evaluates speed of three hash functions: SpookyHashV2, FNV64, and MurmurHash | single_core |
| hash_maps_bench | multiple common operations of the F14 map data structure | single_core |
| iobuf_benchmark | multiple common operations of IOBuf, which manages heap-allocated byte buffers | single_core |
| lt_hash_benchmark | evaluates speed of the LtHash function, which is common in crypto | single_core, all_core |
| memcpy_benchmark | measures and compares memcpy from glibc and folly on various sizes | single_core, all_core |
| memset_benchmark | measures and compares memset from glibc and folly on various sizes | single_core, all_core |
| random_benchmark | evaluates speed of various random number generation functions | single_core, all_core |
| small_locks_benchmark | evaluates performance of multi-thread locks, mutexes, atomic operations, etc. | multi-thread (locks, mutex, etc.) |
| ProtocolBench | evaluates performance of various Thrift RPC protocol operations | single_core, all_core |
For now, for each benchmark we report the results in the `out_name.json` file in the `benchmark_metrics_<uuid>` folder. In the JSON file, the keys are the items run in the benchmark and the values are the corresponding performance numbers, typically throughput in iterations per second.
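For illustration only (the keys and values below are placeholders, not actual benchmark items or reference numbers), an output file follows this shape:
{
  "benchmark_item_1": 1234567.0,
  "benchmark_item_2": 2345678.0
}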
In the future, we plan to add reference performance numbers for each benchmark as a baseline, so that DCPerf can automatically compare the performance of your run against the default reference run.