This page describes the binary and JSON data formats produced by the audiowaveform (1) program.
audiowaveform expects binary waveform data files to use the .dat
file extension. This format consists of a header block followed by the
actual waveform data. All values are little-endian.
The header block starts with a version field that identifies how the rest of the data in the file is structured.
Byte offset | Type | Field |
---|---|---|
0-3 | int32_t | Version |
The version 1 header is structured as follows:
Byte offset | Type | Field |
---|---|---|
4-7 | uint32_t | Flags |
8-11 | int32_t | Sample rate |
12-15 | int32_t | Samples per pixel |
16-19 | uint32_t | Length |
The version 2 header is structured as follows:
Byte offset | Type | Field |
---|---|---|
4-7 | uint32_t | Flags |
8-11 | int32_t | Sample rate |
12-15 | int32_t | Samples per pixel |
16-19 | uint32_t | Length |
20-23 | int32_t | Channels |
Each of these fields is described in detail below.
This field indicates the version number of the waveform data format. The version 1 and 2 data formats are described here. If the format changes in future, the Version field will be incremented.
The Flags field describes the format of the waveform data that follows the header.
Bit | Description |
---|---|
0 (lsb) | 0: 16-bit resolution, 1: 8-bit resolution |
1-31 | Unused |
Sample rate of the original audio file (Hz).
Number of audio samples per waveform minimum/maximum pair.
Length of waveform data (number of minimum and maximum value pairs per channel).
The number of waveform channels present (version 2 only).
Waveform data follows the header block and consists of pairs of minimum and maximum values that each represent a range of samples of the original audio (the "samples per pixel" header field).
The version 1 data format supports only a single audio channel; the audiowaveform program converts stereo audio to mono when generating waveform data. The version 2 data format supports multiple channels, where the data from each channel is interleaved.
For 8-bit data, the waveform data is represented as follows. Each value lies in the range -128 to +127. The example shows a two channel waveform data file.
Byte offset | Type | Value |
---|---|---|
20 | int8_t | Minimum sample value, index 0, channel 0 |
21 | int8_t | Maximum sample value, index 0, channel 0 |
22 | int8_t | Minimum sample value, index 0, channel 1 |
23 | int8_t | Maximum sample value, index 0, channel 1 |
24 | int8_t | Minimum sample value, index 1, channel 0 |
25 | int8_t | Maximum sample value, index 1, channel 0 |
26 | int8_t | Minimum sample value, index 1, channel 1 |
27 | int8_t | Maximum sample value, index 1, channel 1 |
etc | ... | ... |
Pairs of minimum and maximum values repeat to the end of the file.
For 16-bit data, the waveform data is represented as follows. Each value lies in the range -32768 to +32767. The example shows a two channel waveform data file.
Byte offset | Type | Value |
---|---|---|
20-21 | int16_t | Minimum sample value, index 0, channel 0 |
22-23 | int16_t | Maximum sample value, index 0, channel 0 |
24-25 | int16_t | Minimum sample value, index 0, channel 1 |
25-26 | int16_t | Maximum sample value, index 0, channel 1 |
27-28 | int16_t | Minimum sample value, index 1, channel 0 |
29-30 | int16_t | Maximum sample value, index 1, channel 0 |
31-32 | int16_t | Minimum sample value, index 1, channel 1 |
33-34 | int16_t | Maximum sample value, index 1, channel 1 |
etc | ... | ... |
Pairs of minimum and maximum values repeat to the end of the file.
The JSON data format contains the same information as the binary format.
audiowaveform expects JSON data files to use the .json
file extension.
The format consists of a single JSON object containing fields as described
below.
The version number of the waveform data format. The version 1 and 2 data formats are described here. If the format changes in future, the version field will be incremented.
The number of waveform channels present (version 2 only).
Sample rate of the original audio file (Hz).
Number of audio samples per waveform minimum/maximum pair.
Resolution of waveform data. May be either 8 or 16.
Length of waveform data (number of minimum and maximum value pairs per channel).
Array of minimum and maximum waveform data points, interleaved.
The example shows a two channel waveform data file.
Depending on bits
, each value may be in the range -128 to +127
or -32768 to +32767.
Array offset | Value |
---|---|
0 | Minimum sample value, index 0, channel 0 |
1 | Maximum sample value, index 0, channel 0 |
2 | Minimum sample value, index 0, channel 1 |
3 | Maximum sample value, index 0, channel 1 |
4 | Minimum sample value, index 1, channel 0 |
5 | Maximum sample value, index 1, channel 0 |
6 | Minimum sample value, index 1, channel 1 |
7 | Maximum sample value, index 1, channel 1 |
etc | ... |
The following is an example of a (very short) waveform data file in JSON format.
{
"version": 2,
"channels": 2,
"sample_rate": 48000,
"samples_per_pixel": 512,
"bits": 8,
"length": 3,
"data": [-65,63,-66,64,-40,41,-39,45,-55,43,-55,44]
}