Opus decompression
Libdragon unstable features a realtime Opus (CELT) decompressor that can be used for compressed audio. This page explains how to use it and gives some technical details.
Opus compression is transparently integrated in the `audioconv64` tool and the `wav64` file format. Audioconv64 is libdragon's tool to convert input WAV files into libdragon's optimized `wav64` file format. Through the command line option `--wav-compress`, it is possible to select different compression levels for the `wav64` file:
- Level 0: uncompressed audio. This should be used mainly for debugging (e.g. if you suspect a bug in libdragon, you could try uncompressed audio to see if it still triggers).
- Level 1: VADPCM (default). This is the default compression level. It uses a specialized ADPCM algorithm whose decompression is efficiently implemented on the RSP. The compression ratio is 3.5:1, and it uses almost zero CPU time at runtime to decompress, so it makes sense to use it as the default.
- Level 2: currently unused.
- Level 3: Opus. By selecting this level, the file will be compressed using the Opus codec. Opus has a huge number of possible configurations, so audioconv64 tries to select good defaults for most internal parameters. The expected compression ratio is around 15:1.
At runtime, all wav64 files can be played back in exactly the same way, irrespective of the compression level. See the `wav64.h` header file for an overview of the API, and the `mixertest` example for a simple audio playback example.
To avoid exposing many knobs, audioconv64 automatically selects a compression ratio (aka bitrate) that linearly follows the input sample rate. The idea is that the user can tune the compression ratio by reducing the input sample rate (and thus the input frequency bands). For instance, if you have a music file at 44 kHz and you want to experiment with compressing it more with audioconv64 itself:
$ $N64_INST/bin/audioconv64 --wav-compress 3 --wav-resample 22050 music.wav
In general, to obtain the highest possible quality, provide the highest quality version of your audio file to audioconv64, and let it handle resampling by itself. In the case of Opus, resampling is part of the compression process, which produces a better result when it is fed the highest-quality file.
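To get a feeling for what "linearly follows the input sample rate" means in practice, the sketch below estimates the compressed size at a new sample rate from one measured file. It is plain C with no libdragon dependency; `estimate_opus_bytes` is a hypothetical helper, not part of audioconv64, and the strict linear scaling is a simplifying assumption.

```c
#include <stdint.h>

/* Hypothetical helper: estimate the Opus-compressed size (in bytes) at
 * rate_hz, given the measured size ref_bytes of the same audio compressed
 * at ref_rate_hz. Assumes the bitrate scales exactly linearly with the
 * input sample rate; real files will deviate slightly. */
uint64_t estimate_opus_bytes(uint64_t ref_bytes, uint32_t ref_rate_hz,
                             uint32_t rate_hz)
{
    /* Adding ref_rate_hz / 2 rounds to the nearest byte. */
    return (ref_bytes * rate_hz + ref_rate_hz / 2) / ref_rate_hz;
}
```

Taking the 48000 Hz entry of the benchmark table further down this page as reference (232 KiB), `estimate_opus_bytes(232 * 1024, 48000, 24000)` returns 118784 bytes (116 KiB), which is close to the 119 KiB measured for the 24000 Hz version.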
The Opus decompression library has been RSP-accelerated in several parts (mainly three areas: IMDCT/FFT, Comb Filter and Emphasis Filter). This table shows some benchmarks:
Song name | Duration | Channels | Rate | Raw size | Opus size | Opus CPU time |
---|---|---|---|---|---|---|
Octane | 19s | 2 | 48000 | 3.5 MiB | 232 KiB | 6969 us |
Octane | 19s | 2 | 32000 | 2.3 MiB | 157 KiB | 6543 us |
Octane | 19s | 2 | 24000 | 1.8 MiB | 119 KiB | 6210 us |
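
As a rough cross-check of the table, the raw sizes and the resulting compression ratios can be recomputed by hand. The sketch below is plain C with no libdragon dependency; the function names are hypothetical, and it assumes 16-bit PCM samples and 1 KiB = 1024 bytes. The 19 s duration in the table is rounded, so the results are approximate.

```c
#include <stdint.h>

/* Size in bytes of raw PCM audio, assuming 16-bit (2-byte) samples. */
uint64_t pcm_raw_bytes(uint32_t seconds, uint32_t channels, uint32_t rate_hz)
{
    return (uint64_t)seconds * channels * rate_hz * 2;
}

/* Compression ratio multiplied by 10 and rounded to nearest,
 * so that 154 means roughly 15.4:1. Integer math keeps it exact. */
uint32_t ratio_x10(uint64_t raw_bytes, uint64_t compressed_bytes)
{
    return (uint32_t)((raw_bytes * 10 + compressed_bytes / 2) / compressed_bytes);
}
```

For the first row, `pcm_raw_bytes(19, 2, 48000)` is 3648000 bytes (about 3.5 MiB), and `ratio_x10(3648000, 232 * 1024)` is 154, i.e. about 15.4:1. The other two rows come out at about 15.1:1 and 15.0:1, in line with the expected ratio of around 15:1.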
The Opus format is built upon two different codecs: CELT and SILK. Simplifying, CELT is used for music and audio in general (so it is the default codec for wideband audio), while SILK is more specialized for speech (and thus narrowband audio). Opus is also able to mix CELT and SILK in the same audio frame. Libdragon's implementation of Opus only uses CELT. SILK is more resource intensive and only makes sense in specific use cases (speech at very low bitrates). A 16 kHz speech audio file compressed with CELT will still produce very good audio quality, while providing a 16:1 compression ratio.
Opus internally works at a sample rate of 48 kHz by default. The libopus codebase supports "custom modes" (disabled by default) that allow for different sample rates. We activated this support for libdragon and are experimenting with 32 kHz to reduce resource usage, but this is currently not supported. If you inspect an Opus-compressed wav64 file at runtime (via `wav->wave.frequency`), you will see that it always looks like a 48 kHz file (or an integer-decimated version of it: 24 kHz, 12 kHz, etc.). Don't worry about that though: it is an internal detail of how Opus works. The input sample rate was still taken into account to decide the compression factor, so if you compress a 22 kHz file, it will be about half the size of a 44 kHz file and sound worse, as expected.
Opus splits the input file into "frames" made of a fixed number of samples, and compresses them separately. We use Opus in VBR (variable bitrate) mode, which gives the best quality at any given file size; this in turn means that each frame uses a variable number of bytes in its compressed format. Currently, audioconv64 always generates frames of exactly 20 ms of audio (960 samples, at the internal sample rate of 48 kHz), which is the default and suggested frame size for standard audio playback.
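The frame arithmetic above can be sketched in a few lines of plain C. The names below are hypothetical and do not come from libdragon or libopus; the only inputs taken from this page are the 20 ms frame duration and the 48 kHz internal rate. Sample counts are per channel.

```c
#include <stdint.h>

#define FRAME_DURATION_MS 20  /* frame length generated by audioconv64 */

/* Number of samples (per channel) in one 20 ms frame at rate_hz. */
uint32_t samples_per_frame(uint32_t rate_hz)
{
    return rate_hz * FRAME_DURATION_MS / 1000;
}

/* Number of frames needed to cover duration_ms of audio (rounded up,
 * since a partial last frame still occupies a whole frame). */
uint32_t frames_for_duration(uint32_t duration_ms)
{
    return (duration_ms + FRAME_DURATION_MS - 1) / FRAME_DURATION_MS;
}

/* Average compressed bytes per frame. With VBR this is only an average:
 * individual frames can be larger or smaller. */
uint32_t avg_bytes_per_frame(uint64_t compressed_bytes, uint32_t frames)
{
    return (uint32_t)(compressed_bytes / frames);
}
```

`samples_per_frame(48000)` gives the 960 samples mentioned above. For a 19 s file, `frames_for_duration(19000)` is 950 frames, so a 232 KiB file averages about 250 bytes per frame (ignoring any header overhead in the wav64 file).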