Using Memory-Backed HDF5 Files to Reduce Storage Access and Size

Introduction

Logically, an HDF5 file stored in a file system is just a sequence of bytes formatted according to the HDF5 file format specification. Several factors affect the performance with which such a sequence can be manipulated, including (but not limited to) the latency and throughput of the underlying storage hardware. Memory-backed HDF5 files eliminate this constraint by maintaining the underlying byte sequence in contiguous RAM buffers.

The life-cycle of memory-backed HDF5 files is very simple:

  1. Instruct the HDF5 library to create an HDF5 file in memory rather than in a file system. (It is also possible to load existing HDF5 files into memory buffers.)
  2. For the lifetime of the application perform HDF5 I/O operations as usual.
  3. Before application shutdown decide whether to abandon the RAM buffer or to persist certain parts of it in an HDF5 file in a file system.

    The obvious use case for in-memory HDF5 files is their use as I/O buffers, for example, in applications with large numbers of small-size and random-access I/O operations. A second, often overlooked, use case is the ability to deal with a high degree of uncertainty, especially, in application backends. The idea is that by maintaining a memory-backed shadow HDF5 file for such highly uncertain candidates, the decision about what to put into a persisted (in storage) HDF5 file can often be delayed, resulting in faster access and smaller files.

    The vehicle for implementing this behavior in the HDF5 library is the virtual file layer (VFL), specifically, the so-called core virtual file driver (VFD). This is, however, not the place to learn about implementation details. See ??? for that.

    This document is organized as follows: First, we present three examples (sections sec:basic-use, sec:copying-objects, sec:delaying-decisions) with their code outlines and their actual behavior. In section sec:building-blocks, we show the implementation of their (common) building blocks.

Basic Use <sec:basic-use>

The HDF5 library supports an in-memory, a.k.a. in-/core/, representation of HDF5 files, which should not be confused with memory-mapped files. Unlike files in a file system, which are represented by inodes and ultimately blocks on a block device, memory-backed HDF5 files are just contiguous memory regions (“buffers”) formatted according to the HDF5 file format specification. Since they are HDF5 files, the API for such “files” is identical to the one for “regular” (=on-disk) HDF5 files.

The purpose of the first example is to show that working with memory-backed HDF5 files is as straightforward as working with HDF5 files in a file system.

Goal

We would like to write an array of integers to a 2D dataset in a memory-backed HDF5 file.

Before we look at memory-backed HDF5 files, let’s recap the steps for ordinary HDF5 files!

Outline for an HDF5 file in a file system

Given our important array data, we:

  1. Create an HDF5 file, disk.h5
  2. Create a suitably sized and typed dataset /2x3, and write data
  3. Close the dataset
  4. Close the file, after printing the HDF5 library version and the file size on-disk

    **Note:** The paragraph following the outline shows the actual program output.

#include "literate-hdf5.h"

int main(int argc, char** argv)
{
  int data[] = {0, 1, 2, 3, 4, 5};
  hid_t file = (*
                <<make-disk-file>> ) ("disk.h5"); // (ref:vfd0-blk0)
  hid_t dset = (*
                <<make-2D-dataset>> ) (file, "2x3", H5T_STD_I32LE, // (ref:vfd0-blk1)
                                       (hsize_t[]){2,3}, data);
  H5Dclose(dset);

  (*
   <<print-lib-version>> ) ();
  (*
   <<print-file-size>> ) (file);

  H5Fclose(file);

  return 0;
}

The <<make-disk-file>> block (line (vfd0-blk0)) is merely a call to H5Fcreate (see section sec:disk-file-creation) and the <<make-2D-dataset>> block (line (vfd0-blk1)) is a call to H5Dcreate with all the trimmings (see section sec:dataset-creation).

Outline for a memory-backed HDF5 file

The outline for memory-backed HDF5 files is almost identical to the one for on-disk files. The <<make-mem-file>> block on line (mem-file-creation) takes two additional arguments (see section sec:mem-file-creation). The first is the increment (in bytes) by which the backing memory buffer grows, should that be necessary. In this example, it’s 1 MiB. The second, a flag, controls whether the memory-backed file is persisted in storage after closing. Any argument passed to the executable is interpreted as TRUE, and the file is persisted. By default (no arguments), there won’t be a core.h5 file after running the program.

#include "literate-hdf5.h"

int main(int argc, char** argv)
{
  int data[] = {0, 1, 2, 3, 4, 5};
  hid_t file = (*
                <<make-mem-file>> ) ("core.h5", 1024*1024, (argc > 1)); // (ref:mem-file-creation)
  hid_t dset = (*
                <<make-2D-dataset>> ) (file, "2x3", H5T_STD_I32LE,
                                       (hsize_t[]){2,3}, data);
  H5Dclose(dset);

  (*
   <<print-lib-version>> ) ();
  (*
   <<print-file-size>> ) (file);

  H5Fclose(file);

  return 0;
}

The only difference between the on-disk and the memory-backed version is line (mem-file-creation), which shows that

  1. We are dealing with HDF5 files after all.
  2. The switch to memory-backed HDF5 files requires only minor changes of existing applications.

    See section sec:mem-file-creation for the implementation of <<make-mem-file>>.

Discussion

When running the executable core-vfd1 for the memory-backed HDF5 file, we are informed that, for HDF5 library version 1.13.0, the (in-memory) file has a size of 1,048,576 bytes (1 MiB). However, the dataset’s raw data is only 24 bytes (six elements of four bytes each), plus a small amount of metadata. Since we told the core VFD to grow the file in 1 MiB increments, that’s the minimum allocation.

Running the program with any argument will persist the memory-backed HDF5 file as core.h5. Surprisingly, that file is only 2072 bytes (for HDF5 1.13.0). The reason is that the HDF5 library truncates and eliminates any unused space in the memory-backed HDF5 file before closing it.
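The two sizes can be reconciled with a little arithmetic. The sketch below is a simplified model, not the library's code: it assumes the in-memory buffer always grows to the next whole multiple of the increment, and the helper name core_buffer_size is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the HDF5 API): size of a buffer that
 * grows in whole increments, given the number of bytes actually in use.
 * Assumption: the allocation is rounded up to a multiple of increment. */
static size_t core_buffer_size(size_t bytes_in_use, size_t increment)
{
  assert(increment > 0);
  size_t n = bytes_in_use / increment; /* number of full increments */
  if (bytes_in_use % increment != 0)
    n += 1;                            /* one more for the remainder */
  return n * increment;
}
```

With the numbers from this example, core_buffer_size(2072, 1024*1024) evaluates to 1,048,576, the in-memory size reported above, while the persisted core.h5 keeps only the 2072 bytes in use.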

**Bottom line:** Memory-backed HDF5 files are as easy to use as HDF5 files in file systems.

Copying Objects <sec:copying-objects>

We can copy HDF5 objects such as groups and datasets inside the same HDF5 file or across HDF5 files. A common scenario is to use a memory-backed HDF5 file as a scratch space (or RAM disk) and, before closing it, to store only a few selected objects of interest in an on-disk HDF5 file.

Goal

We would like to copy a dataset from a memory-backed HDF5 file to an HDF5 file stored in a file system.

Outline

In this example, we are working with two HDF5 files, one memory-backed and the other in a file system. We re-use the file creation building blocks (lines (copy-file1), (copy-file2)) and the dataset creation building block (line (copy-mdset)) to create a dataset dset_m in the memory-backed HDF5 file file_m. Fortunately, the HDF5 library provides a function, H5Ocopy, for copying HDF5 objects between HDF5 files. All we have to do is call it on line (copy-call).

#include "literate-hdf5.h"

int main(int argc, char** argv)
{
  int data[] = {0, 1, 2, 3, 4, 5};

  hid_t file_d = (*
                  <<make-disk-file>> ) ("disk.h5"); // (ref:copy-file1)
  hid_t file_m = (*
                  <<make-mem-file>> ) ("core.h5", 4096, (argc > 1)); // (ref:copy-file2)
  hid_t dset_m = (*
                  <<make-2D-dataset>> ) (file_m, "2x3", H5T_STD_I32LE, // (ref:copy-mdset)
                                         (hsize_t[]){2,3}, data);
  H5Dclose(dset_m);

  (*
   <<print-lib-version>> ) ();
  (*
   <<print-file-size>> ) (file_m);

  H5Ocopy(file_m, "2x3", file_d, "2x3copy", H5P_DEFAULT, H5P_DEFAULT); // (ref:copy-call)

  H5Fclose(file_m);

  (*
   <<print-file-size>> ) (file_d);

  H5Fclose(file_d);

  return 0;
}

Discussion

When running the program core-vfd2, we are informed that, for HDF5 library version 1.13.0, both files have a size of 4 KiB. That is a coincidence of two independent factors: Firstly, in line (copy-file2), we instructed the HDF5 library to grow the memory-backed HDF5 file in 4 KiB increments, and one increment is plenty to accommodate our small dataset. Secondly, the 4 KiB size of the disk.h5 file is due to paged allocation with 4 KiB being the default page size. (Really?)

**Bottom line:** Transferring objects or parts of a hierarchy from a memory-backed HDF5 file to another HDF5 file, be it in a file system or another memory-backed file, is easy thanks to H5Ocopy!

Delaying Decisions <sec:delaying-decisions>

The developers and maintainers of certain application types, for example, data persistence back-ends of interactive applications, face specific challenges which stem from the uncertainty over the particular course of action(s) their users take as part of a transaction or over the duration of a session. Ideally, any decisions that amount to commitments not easily undone later can be postponed or delayed until a better informed decision can be made.

As stated earlier, when creating new objects, the HDF5 library needs certain information (e.g., creation properties) which stays with an object throughout its lifetime and which is immutable. The copy approach from the previous example won’t work, because it preserves HDF5 objects’ creation properties. Still, a memory-backed HDF5 “shadow” file can be used effectively alongside other HDF5 files as a holding area for objects whose final whereabouts are uncertain at object creation time.

Goal

We would like to maintain a potentially very large 2D dataset in a memory-backed HDF5 file and eventually persist it to an HDF5 file in a file system.

Outline

There are a few new snippets in this example. The <<make-big-2D-dataset>> block on line (big-dset) appears identical to <<make-2D-dataset>>, but the implementation in section sec:big-dataset-creation shows that we are dealing with a dataset of potentially arbitrary extent, using chunked storage layout.

Between lines (uncert1) and (uncert2), we mimic the uncertainty around its extent during an application’s lifetime by growing and shrinking it using H5Dset_extent.

On line (size-check), we check its size once more (see section sec:dataset-size). If the size doesn’t exceed 60,000 bytes, we optimize its persisted representation by using the so-called compact storage layout (line (compact) and section sec:compact-replica). In this case we need to transfer the data manually (line (data-xfer) and section sec:dataset-xfer). Otherwise, we fall back onto H5Ocopy (line (big-copy)).

#include "literate-hdf5.h"

int main(int argc, char** argv)
{
  int data[] = {0, 1, 2, 3, 4, 5};
  hid_t file_d = (*
                  <<make-disk-file>> ) ("disk.h5");
  hid_t file_m = (*
                  <<make-mem-file>> ) ("core.h5", 1024*1024, (argc > 1));
  hid_t dset_m = (*
                  <<make-big-2D-dataset>> ) (file_m, "2x3", // (ref:big-dset)
                                             H5T_NATIVE_INT32,
                                             (hsize_t[]){2,3}, data);
  (*
   <<print-lib-version>> ) ();
  (*
   <<print-file-size>> ) (file_m);

  { /* UNCERTAINTY */
    H5Dset_extent(dset_m, (hsize_t[]){200,300}); // (ref:uncert1)

    H5Dset_extent(dset_m, (hsize_t[]){200000,300000});

    H5Dset_extent(dset_m, (hsize_t[]){2,3}); // (ref:uncert2)
  }

  if ((*
       <<dataset-size>>) (dset_m) < 60000) // (ref:size-check)
    {
      hid_t dset_d = (*
                      <<create-compact>> ) (dset_m, file_d, "2x3copy"); // (ref:compact)
      (*
       <<xfer-data>> ) (dset_m, dset_d); // (ref:data-xfer)

      H5Dclose(dset_d);
    }
  else
    {
      H5Ocopy(file_m, "2x3", file_d, "2x3copy", H5P_DEFAULT, H5P_DEFAULT); // (ref:big-copy)
    }

  H5Dclose(dset_m);
  H5Fclose(file_m);

  (*
   <<print-file-size>> ) (file_d);

  H5Fclose(file_d);

  return 0;
}

Discussion

When running the program core-vfd3, we are informed that, for HDF5 library version 1.13.0, the memory-backed HDF5 file has a size of over 4 MiB while the persisted file is just 2 KiB.

As can be seen in section sec:big-dataset-creation, the chunk size chosen for the /2x3 dataset is 4 MiB. Although we are writing only six 32-bit integers (24 bytes), a full 4 MiB chunk needs to be allocated, which explains the overall size of the memory-backed HDF5 file.
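The 4 MiB figure can be checked with a short calculation. The following is a minimal sketch, assuming that every chunk overlapping the dataset extent is allocated in full (the library may allocate fewer chunks if some were never written); chunked_bytes is a hypothetical helper, not an HDF5 call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an HDF5 call): bytes needed to hold every chunk
 * that overlaps a 2D extent. Assumption: all of those chunks are allocated
 * in full, even if only a few elements in them are used. */
static unsigned long long chunked_bytes(const unsigned long long dims[2],
                                        const unsigned long long chunk[2],
                                        size_t elem_size)
{
  assert(chunk[0] > 0 && chunk[1] > 0);
  /* Number of chunks along each dimension, rounded up. */
  unsigned long long rows = (dims[0] + chunk[0] - 1) / chunk[0];
  unsigned long long cols = (dims[1] + chunk[1] - 1) / chunk[1];
  return rows * cols * chunk[0] * chunk[1] * elem_size;
}
```

For a 2x3 extent, 1024x1024 chunks, and 4-byte elements, this yields 4,194,304 bytes (4 MiB): a single chunk, regardless of how few elements it actually holds.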

The compact storage layout is particularly storage- and access-efficient: the dataset elements are stored as part of the dataset’s object header (metadata). This header is read whenever the dataset is opened, and the dataset elements “travel along for free”, which means that there is no separate storage access necessary for subsequent read or write operations.

**Bottom line:** The use of memory-backed HDF5 files can lead to substantial storage and access performance improvements, if applications “keep their cool” and do not prematurely commit storage resources to HDF5 objects.

Building Blocks <sec:building-blocks>

On-disk HDF5 file creation <sec:disk-file-creation>

H5Fcreate has four parameters, of which the first two, file name and access flag, are usually in the limelight. Creating an on-disk HDF5 file is as easy as this:

lambda(hid_t, (const char* name),
       {
         return H5Fcreate(name, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
       })

The third and the fourth parameter, a file creation and a file access property list (handle), unlock a few extra treats, as we will see in a moment.

In-memory HDF5 file creation <sec:mem-file-creation>

We use the fourth parameter of H5Fcreate, a file access property list, to do the in-memory magic.

lambda(hid_t, (const char* name, size_t increment, hbool_t flg),
       {
         hid_t retval;
         hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

         H5Pset_fapl_core(fapl, increment, flg); // (ref:fapl-core)

         retval = H5Fcreate(name, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
         H5Pclose(fapl);
         return retval;
       })

That’s right, a suitably initialized property list (line (fapl-core)) makes all the difference. This is in fact the ONLY difference between an application using regular vs. memory-backed HDF5 files.

Dataset creation <sec:dataset-creation>

To create a dataset, we must specify a name, its element type dtype, its shape dims, and, optionally, an initial value buffer. Without additional customization, the default dataset storage layout is H5D_CONTIGUOUS, i.e., the (fixed-size) dataset elements are laid out in a contiguous (memory or storage) region.

lambda(hid_t,
       (hid_t file, const char* name, hid_t dtype, const hsize_t* dims, void* buffer),
       {
         hid_t retval;
         hid_t dspace = H5Screate_simple(2, dims, NULL);

         retval = H5Dcreate(file, name, dtype, dspace, // (ref:dset-dtype1)
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

         if (buffer)
           H5Dwrite(retval, dtype, H5S_ALL, H5S_ALL, H5P_DEFAULT, buffer); // (ref:dset-dtype2)

         H5Sclose(dspace);
         return retval;
       })

**WARNING:** This snippet contains an important assumption that may not be obvious to many readers: The datatype handle dtype is used in two places with different interpretations. In the first instance, line (dset-dtype1), it refers to the in-file element type of the dataset to be created. In the second instance, line (dset-dtype2), it refers to the datatype of the elements in buffer. The assumption is that the two are the same. While this assumption is valid in many practical examples, it can lead to subtle errors if its violation goes undetected. In production code, this should either be documented and enforced, or an additional datatype argument should be passed to distinguish the two.

Print library and file info

lambda(void, (void),
       {
         unsigned majnum;
         unsigned minnum;
         unsigned relnum;
         H5get_libversion(&majnum, &minnum, &relnum);
         printf("HDF5 library version %u.%u.%u\n", majnum, minnum, relnum);
       })
lambda(void, (hid_t file),
       {
         hsize_t size;
         H5Fget_filesize(file, &size);
         printf("File size: %llu bytes\n", (unsigned long long) size);
       })

Big dataset creation <sec:big-dataset-creation>

This lambda returns a handle to the potentially large dataset in the memory-backed HDF5 file. Since the dataset’s final size will only be known eventually (e.g., end of epoch or transaction), we can’t impose a finite maximum extent. On line (big-sky), we set the maximum extent as unlimited in all (2) dimensions. Currently, the only HDF5 storage layout that supports such an arrangement is chunked storage layout. By passing a non-default dataset creation property list dcpl to H5Dcreate (line (big-dcpl)), we instruct the HDF5 library to use chunked storage layout instead of the default contiguous layout. For chunked layout, we must specify the size of an individual chunk in terms of dataset elements per chunk; see line (big-chunk). The size of a chunk in bytes depends on the element datatype. In our example (32-bit integers), a 1024^2 chunk occupies 4 MiB of memory or storage.

lambda(hid_t,
       (hid_t file, const char* name, hid_t dtype, const hsize_t* dims, void* buffer),
       {
         hid_t retval;
         hid_t dspace = H5Screate_simple(2, dims,
                                         (hsize_t[]){H5S_UNLIMITED, H5S_UNLIMITED}); // (ref:big-sky)
         hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);

         H5Pset_chunk(dcpl, 2, (hsize_t[]){1024, 1024}); // (ref:big-chunk)
         retval = H5Dcreate(file, name, dtype, dspace, // (ref:big-dtype1)
                            H5P_DEFAULT, dcpl, H5P_DEFAULT); // (ref:big-dcpl)

         if (buffer)
           H5Dwrite(retval, dtype, H5S_ALL, H5S_ALL, H5P_DEFAULT, buffer); // (ref:big-dtype2)

         H5Pclose(dcpl);
         H5Sclose(dspace);
         return retval;
       })

The same warning and assumptions expressed at the end of section sec:dataset-creation apply to dtype.

Dataset size <sec:dataset-size>

This lambda returns the size (in bytes) of the source dataset in the memory-backed HDF5 file. It’s a matter of determining the storage size of an individual dataset element and counting how many there are (lines (size-calc1), (size-calc2)).

lambda(size_t, (hid_t dset),
       {
         size_t retval;
         hid_t ftype = H5Dget_type(dset);
         hid_t dspace = H5Dget_space(dset);

         retval = H5Tget_size(ftype) *  // (ref:size-calc1)
           (size_t) H5Sget_simple_extent_npoints(dspace); // (ref:size-calc2)

         H5Sclose(dspace);
         H5Tclose(ftype);
         return retval;
       })
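The product of element size and element count can get large: the 200000 x 300000 extent used earlier corresponds to 240,000,000,000 bytes for 4-byte elements, which does not fit in a 32-bit size_t. The sketch below shows one way to guard the multiplication. It is plain C with no HDF5 calls, and extent_bytes is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an HDF5 call): computes
 * elem_size * dims[0] * ... * dims[rank-1], or returns SIZE_MAX if the
 * product does not fit in size_t. */
static size_t extent_bytes(size_t elem_size, const uint64_t* dims, int rank)
{
  assert(rank >= 0);
  size_t total = elem_size;
  for (int i = 0; i < rank; ++i) {
    if (dims[i] != 0 && total > SIZE_MAX / dims[i])
      return SIZE_MAX; /* saturate instead of silently wrapping around */
    total *= (size_t) dims[i];
  }
  return total;
}
```

For the final 2x3 extent and 4-byte elements the result is 24, comfortably below the 60,000-byte threshold used in the example.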

Compact replica <sec:compact-replica>

This lambda returns a handle to the freshly minted compact replica of the source dataset. (It’s a placeholder, because the actual values are transferred separately.)

What sets this dataset creation apart from the default case occurs on lines (compact-layout)-(compact-dcpl). By passing a non-default dataset creation property list dcpl to H5Dcreate, we instruct the HDF5 library to use compact storage layout instead of the default contiguous (H5D_CONTIGUOUS) layout.

lambda(hid_t, (hid_t src_dset, hid_t file, const char* name),
       {
         hid_t retval;
         hid_t ftype = H5Dget_type(src_dset);
         hid_t src_dspace = H5Dget_space(src_dset); // (ref:compact-src)
         hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);

         hid_t dspace = H5Scopy(src_dspace); // (ref:compact1)
         hsize_t dims[H5S_MAX_RANK]; // (ref:compact-max-rank)
         H5Sget_simple_extent_dims(dspace, dims, NULL);
         H5Sset_extent_simple(dspace, H5Sget_simple_extent_ndims(dspace),
                              dims, NULL); // (ref:compact2)

         H5Pset_layout(dcpl, H5D_COMPACT); // (ref:compact-layout)
         retval = H5Dcreate(file, name, ftype, dspace,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT); // (ref:compact-dcpl)

         H5Pclose(dcpl);
         H5Sclose(dspace);
         H5Sclose(src_dspace); /* release the source dataspace handle as well */
         H5Tclose(ftype);
         return retval;
       })

Two other things are worth mentioning about this snippet.

  1. The dataspace construction on lines (compact1)-(compact2) appears a little clumsy. Since the extent of the source dataset src_dset is not changing, why not just work with src_dspace (line (compact-src))? The reason is that dataspaces with H5S_UNLIMITED extent bounds, for obvious reasons, are not supported with compact layout. In that case, in our example (!), passing src_dspace as an argument to H5Dcreate would generate an error. It’s easier to just create a copy of the dataspace and “kill” (NULL) whatever maximum extent there might be.
  2. On line (compact-max-rank), we use the HDF5 library macro H5S_MAX_RANK to avoid the dynamic allocation of the dims array.

Data transfer <sec:dataset-xfer>

The HDF5 library does not currently have a function to “automagically” transfer data between two datasets, especially datasets with different storage layouts. There is not much else we can do but to read (line (xfer-read)) the data from the source dataset and to write (line (xfer-write)) to the destination dataset.

Since we transfer the data through memory, we need to determine first the size of the transfer buffer needed (line (xfer-size)).

lambda(void, (hid_t src, hid_t dst),
       {
         hid_t ftype = H5Dget_type(src);
         hid_t dspace = H5Dget_space(src);
         size_t size = H5Tget_size(ftype) * H5Sget_simple_extent_npoints(dspace); // (ref:xfer-size)
         char* buffer = (char*) malloc(size);

         H5Dread(src, ftype, H5S_ALL, H5S_ALL, H5P_DEFAULT, buffer); // (ref:xfer-read)
         H5Dwrite(dst, ftype, H5S_ALL, H5S_ALL, H5P_DEFAULT, buffer); // (ref:xfer-write)

         free(buffer);
         H5Sclose(dspace);
         H5Tclose(ftype);
       })