J. Tian edited this page Apr 13, 2022 · 2 revisions

This page previews the APIs planned for release and serves as a record. All APIs mentioned here are subject to change without notice until the final 0.3 release (when the -rc* suffix is dropped).

Essentials

  1. cuSZ assumes all the data are on device memory, and all the metadata are on host memory.
  2. For performance reasons,
    • cuSZ does not handle I/O, memory copies between memory spaces (except internally), or communication, and
    • cuSZ performs a one-time initialization, which can be expensive but is amortized over repeated use.
  3. After initialization, the compressor can be used
    • for both compression and decompression, and
    • for compressing other data of the same ND size.

Configuration

  • Two data structures configure the compressor: cuszCTX ("context") and cuszHEADER ("header"). The former describes the runtime configuration, while the latter describes the file format.
  • The current implementation (as of 0.3-rc2) makes "context" a superset of "header".
  • Compression requires only the "context"; decompression requires both the "context" and the "header".

A "context" is made of string-based configuration in k-v pairs (k1=v1,k2=v2,...), where the delimiter of pairs is a comma (","). A "header" is stored and accessed during decompression.

| key            | value options                   | description                                               | required |    default    | CLI counterpart |
| -------------- | ------------------------------- | --------------------------------------------------------- | :------: | :-----------: | --------------- |
| len            | "[X]", "[X]x[Y]", "[X]x[Y]x[Z]" | ND length; also for allocation size by default            |    •     |       -       | -l [X]x[Y]x[Z]  |
| eb             | scientific notation             | error bound                                               |    •     |       -       | -e [EB]         |
| mode           | "abs", "r2r"                    | error-bounding mode                                       |    •     |       -       | -m [MODE]       |
| alloclen       | "[X]", "[X]x[Y]", "[X]x[Y]x[Z]" | overriding allocation length                              |          | same as "len" |                 |
| radius         | power-of-two integer            | quantization code coverage (single-side)                  |          |      512      | n/a             |
| pipeline       | "auto", "binary", "radius"      | whether to use sparsity-aware (binary) path               |          |    "auto"     | n/a             |
| anchor         | "on", "ON", "off", "OFF"        | whether to use anchor point                               |          |     "off"     | n/a             |
| huffbyte       | "4", "8"                        | override to use 8-byte internal type for VLE              |          |      "4"      | n/a             |
| nondestructive | "on", "ON", "off", "OFF"        | whether to overwrite the input data                       |          |               |                 |
| failfast       | "on", "ON", "off", "OFF"        | whether to fail or allocate more memory on out-of-memory  |          |               |                 |
| densityfactor  | number greater than 1           | override outlier gatherer (spcodec) reserved space factor |          |   "4" (25%)   | n/a             |
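
To make the string format concrete, here is a minimal sketch of how a k-v configuration string and the "len" value could be split on the host. The helpers `parse_kv` and `parse_len` are hypothetical names introduced for illustration; they are not part of cuSZ and do not reflect how cuSZ parses its context internally.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of cuSZ): split "k1=v1,k2=v2,..." into a map.
std::map<std::string, std::string> parse_kv(const std::string& config)
{
    std::map<std::string, std::string> kv;
    std::istringstream stream(config);
    std::string pair;
    while (std::getline(stream, pair, ',')) {
        if (pair.empty()) continue;  // tolerate a trailing or doubled comma
        const auto eq = pair.find('=');
        if (eq == std::string::npos || eq == 0)
            throw std::invalid_argument("expected key=value, got: " + pair);
        kv[pair.substr(0, eq)] = pair.substr(eq + 1);
    }
    return kv;
}

// Hypothetical helper: turn "[X]", "[X]x[Y]" or "[X]x[Y]x[Z]" into {X, Y, Z},
// padding the missing dimensions with 1.
std::array<std::size_t, 3> parse_len(const std::string& len)
{
    std::array<std::size_t, 3> dims{{1, 1, 1}};
    std::istringstream stream(len);
    std::string token;
    std::size_t n = 0;
    while (std::getline(stream, token, 'x')) {
        if (n == dims.size())
            throw std::invalid_argument("more than 3 dimensions: " + len);
        dims[n++] = std::stoull(token);  // throws if the token is not a number
    }
    if (n == 0) throw std::invalid_argument("empty length");
    return dims;
}
```

For "len=3600x1800,eb=1e-4,mode=r2r", `parse_kv` yields three entries, and `parse_len` applied to the "len" value gives the dimensions 3600, 1800, and 1.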

The cuSZ API is designed around two use scenarios.

API Use

First, the data type must be fixed before accessing the compressor; then the compressor is specified by predictor type:

using Compressor = typename Framework<Data>::XFeaturedCompressor;

where X is replaced with the desired predictor: Lorenzo (ready) or Spline3 (in progress). Alternatively, the user can use the default compressor:

using Compressor = typename Framework<Data>::DefaultCompressor;

Two configuration structs are involved: cusz::Context at compress time and cusz::Header at decompress time. At compress time, a C string is required to construct a cusz::Context instance.

char const* config_str = "len=3600x1800,eb=1e-4,mode=r2r";
auto ctx = new cusz::Context(config_str);
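
When the dimensions and the error bound are only known at run time, the configuration string has to be composed first. The following is a minimal sketch, assuming a hypothetical helper `make_config_string` (not part of cuSZ). Whether cuSZ accepts every spelling of scientific notation produced by `%e` is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not part of cuSZ): compose "len=...,eb=...,mode=...".
std::string make_config_string(
    const std::vector<std::size_t>& dims, double eb, const std::string& mode)
{
    assert(!dims.empty() && dims.size() <= 3);  // the table lists 1D to 3D lengths

    // Join the dimensions with 'x', e.g. {3600, 1800} -> "3600x1800".
    std::string len;
    for (std::size_t i = 0; i < dims.size(); ++i) {
        if (i > 0) len += 'x';
        len += std::to_string(dims[i]);
    }

    // Format the error bound in scientific notation.
    char eb_text[32];
    std::snprintf(eb_text, sizeof(eb_text), "%e", eb);

    return "len=" + len + ",eb=" + eb_text + ",mode=" + mode;
}
```

The returned std::string must outlive any `char const*` obtained from it via `c_str()`, so keep it in a named variable for as long as the pointer is in use.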

Alternatively, numeric and string-typed options can be set separately:

auto ctx = new cusz::Context();
ctx->set_len(3600, 1800, 1, 1)        // In this case, the last 2 arguments can be omitted.
    .set_eb(2.4e-4)                   // numeric
    .set_control_string("mode=r2r");  // string
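
The "r2r" mode is not spelled out on this page. Assuming it denotes an error bound relative to the value range of the data (the usual meaning in the SZ family), the equivalent absolute bound can be sketched as below. `r2r_to_abs` is a hypothetical, host-side helper for illustration only; cuSZ itself expects the data on device memory.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not part of cuSZ).
// Assumption: "r2r" scales the bound by the value range (max - min) of the data.
template <typename T>
double r2r_to_abs(const std::vector<T>& data, double eb_r2r)
{
    assert(!data.empty());
    // minmax_element returns iterators to the smallest and largest elements.
    const auto mm = std::minmax_element(data.begin(), data.end());
    const double range =
        static_cast<double>(*mm.second) - static_cast<double>(*mm.first);
    return eb_r2r * range;
}
```

Under this assumption, data spanning [-2, 6] with eb=1e-4 in "r2r" mode corresponds to an absolute bound of about 8e-4.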

With the data ready on device memory, we can perform compression and decompression. Please refer to the API definition below or the example code.

API Definition

The core compression and decompression APIs are defined as

/**
 * @brief Core compression API for cuSZ, requiring that input and output are on device pointers/iterators.
 *
 * @tparam Compressor predefined Compressor type, accessible via cusz::Framework<T>::XFeaturedCompressor
 * @tparam T uncompressed data type
 * @param compressor Compressor instance
 * @param config (host) cusz::Context as configuration type
 * @param uncompressed (device) input uncompressed type
 * @param uncompressed_alloc_len (host) for checking; must exceed 1.03x the original data size to ensure legal memory access
 * @param compressed (device) exposed compressed array in Compressor (shallow copy); need to transfer before Compressor
 * destruction
 * @param compressed_len (host) output compressed array length
 * @param header (host) header for compressed binary description; acquired by a deep copy
 * @param stream CUDA stream
 * @param timerecord collected time information for compressor; acquired by a deep copy
 */
template <class Compressor, typename T>
void core_compress(
    Compressor *compressor, cusz::Context *config,
    T *uncompressed, size_t uncompressed_alloc_len,
    BYTE *&compressed, size_t &compressed_len, cusz::Header &header,
    cudaStream_t stream = nullptr,
    cusz::TimeRecord *timerecord = nullptr);


/**
 * @brief Core decompression API for cuSZ, requiring that input and output are on device pointers/iterators.
 *
 * @tparam Compressor predefined Compressor type, accessible via cusz::Framework<T>::XFeaturedCompressor
 * @tparam T uncompressed data type
 * @param compressor Compressor instance
 * @param config (host) cusz::Header as configuration type
 * @param compressed (device) input compressed array
 * @param compressed_len (host) input compressed length for checking
 * @param decompressed (device) output decompressed array
 * @param decompressed_alloc_len (host) for checking; must exceed 1.03x the original data size to ensure legal memory access
 * @param stream CUDA stream
 * @param timerecord collected time information for compressor; acquired by a deep copy
 */
template <class Compressor, typename T>
void core_decompress(
    Compressor *compressor, cusz::Header *config,
    BYTE *compressed, size_t compressed_len,
    T *decompressed, size_t decompressed_alloc_len,
    cudaStream_t stream = nullptr,
    cusz::TimeRecord *timerecord = nullptr);             
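
Both allocation-length parameters are documented as ">1.03x the original data size". A minimal sketch of computing such a length follows, assuming the length is counted in elements (the page does not state the unit) and using a hypothetical helper `padded_alloc_len` that is not part of cuSZ.

```cpp
#include <cstddef>

// Hypothetical helper (not part of cuSZ): smallest integer strictly
// greater than 1.03 * len, computed with integer arithmetic only.
// Note: len * 3 may overflow for lengths close to SIZE_MAX / 3.
std::size_t padded_alloc_len(std::size_t len)
{
    // floor(3 * len / 100) + 1 is strictly greater than 0.03 * len.
    return len + len * 3 / 100 + 1;
}
```

For a 3600x1800 field (6480000 elements), this gives 6674401 elements, which could then be used to size the device buffer and passed as the allocation length.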

Future Work

cusz::Framework will be expanded to support more data types and to allow selecting an individual Codec and SpCodec, for example,

using Compressor = cusz::Framework<T>::CompressorTemplate<Predictor, Codec, SpCodec>;