Giovanni Bajo edited this page Jan 11, 2024 · 12 revisions

Libdragon unstable features a realtime Opus (CELT) decompressor that can be used for compressed audio. This page explains how to use it and gives some technical details.

How to compress a WAV file

Opus compression is transparently integrated in the audioconv64 tool and the wav64 file format. Audioconv64 is libdragon's tool to convert input WAV files into libdragon's optimized wav64 file format. Through the command line option --wav-compress, it is possible to select different compression levels for the wav64 file:

  • Level 0: uncompressed audio. This is mainly useful for debugging (e.g. if you suspect a bug in libdragon, you can try uncompressed audio to see whether the problem still occurs).
  • Level 1: VADPCM (default). This level uses a specialized ADPCM algorithm whose decompression is efficiently implemented on the RSP. The compression ratio is 3.5:1, and it uses almost zero CPU time at runtime to decompress, so it makes sense as the default.
  • Level 2: currently unused.
  • Level 3: Opus. By selecting this level, the file will be compressed using the Opus codec. Opus has a gazillion possible configurations, so audioconv64 tries to select good defaults for most internal parameters. The expected compression ratio is around 15:1.
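To get a feeling for what the levels above mean in terms of ROM space, here is a minimal sketch that estimates the output size from the ratios quoted in the list. The function names are hypothetical (they are not part of libdragon or audioconv64), and the Opus ratio is only the expected figure, so real files will differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: approximate compression ratio for each
   --wav-compress level, using the figures quoted in the list above.
   Returns 0.0 for level 2, which is currently unused. */
static double wav64_level_ratio(int level)
{
    switch (level) {
    case 0:  return 1.0;   /* uncompressed */
    case 1:  return 3.5;   /* VADPCM */
    case 3:  return 15.0;  /* Opus: expected ratio, content dependent */
    default: return 0.0;   /* level 2: no codec assigned */
    }
}

/* Hypothetical helper: rough output size in bytes for a given raw PCM
   size. Returns 0 when the level has no codec. Headers are ignored. */
static uint64_t estimate_wav64_bytes(uint64_t raw_bytes, int level)
{
    assert(level >= 0 && level <= 3);  /* only levels 0..3 are documented */
    double ratio = wav64_level_ratio(level);
    if (ratio <= 0.0)
        return 0;
    return (uint64_t)((double)raw_bytes / ratio);
}
```

For example, estimate_wav64_bytes(3500, 1) gives 1000 and estimate_wav64_bytes(15000, 3) gives 1000, which is just the 3.5:1 and 15:1 ratios applied to the raw size.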

How to playback

At runtime, all wav64 files are played back in exactly the same way, irrespective of the compression level. See the wav64.h header file for an overview of the API, and the mixertest example for a simple audio playback example.

How to tune the quality and compression ratio

To avoid exposing many knobs, audioconv64 automatically selects a compression ratio (aka bitrate) that linearly follows the input sample rate. The idea is that the user can tune the compression ratio by reducing the input sample rate (and thus the input frequency bands). For instance, if you have a music file at 44 kHz and you want to compress it further, you can ask audioconv64 itself to resample it to 22050 Hz:

$ $N64_INST/bin/audioconv64 --wav-compress 3 --wav-resample 22050 music.wav

In general, to obtain the highest possible quality, provide the highest quality version of your audio file to audioconv64, and let it handle the resampling itself (in the case of Opus, resampling is actually part of the compression process, which produces a better result when it is fed the highest-quality file).
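The linear relationship between sample rate and bitrate can be illustrated with a small sketch. It assumes 16-bit PCM input and a fixed target of about 15:1 (the expected ratio quoted earlier); the constant and the function name are hypothetical, and the parameters audioconv64 actually passes to the encoder may differ.

```c
#include <assert.h>

/* Assumption: target ratio of about 15:1 over 16-bit PCM. This is not
   read from audioconv64; it is the expected ratio quoted on this page. */
#define ASSUMED_OPUS_RATIO 15

/* Hypothetical helper: estimated Opus bitrate in bits per second. */
static long estimate_opus_bps(long rate_hz, int channels)
{
    assert(rate_hz > 0 && channels > 0);
    long raw_bps = rate_hz * channels * 16;  /* 16 bits per sample */
    return raw_bps / ASSUMED_OPUS_RATIO;
}
```

Under these assumptions, a stereo file at 44100 Hz comes out at 94080 bit/s and the same file resampled to 22050 Hz at 47040 bit/s: halving the sample rate halves the bitrate, and therefore the file size.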

Benchmarks

The Opus decompression library has been RSP-accelerated in several parts (mainly three areas: IMDCT/FFT, Comb Filter and Emphasis Filter). This table shows some benchmarks:

Song name   Duration   Channels   Rate (Hz)   Raw size   Opus size   Opus CPU time
Octane      19s        2          48000       3.5 MiB    232 KiB     6969 us
Octane      19s        2          32000       2.3 MiB    157 KiB     6543 us
Octane      19s        2          24000       1.8 MiB    119 KiB     6210 us
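The size columns of the table can be cross-checked with a few lines of arithmetic. This sketch assumes 16-bit PCM for the raw size; the function names are hypothetical. Since the duration in the table is rounded to whole seconds, the computed raw size only approximately matches the "Raw size" column.

```c
#include <assert.h>

/* Hypothetical helper: raw PCM size in bytes, assuming 16-bit samples. */
static double raw_pcm_bytes(double seconds, int channels, long rate_hz)
{
    assert(seconds >= 0.0 && channels > 0 && rate_hz > 0);
    return seconds * channels * (double)rate_hz * 2.0;
}

/* Hypothetical helper: compression ratio computed from the two size
   columns of the table (raw size in MiB, Opus size in KiB). */
static double ratio_from_table(double raw_mib, double opus_kib)
{
    assert(opus_kib > 0.0);
    return (raw_mib * 1024.0) / opus_kib;
}
```

With the values from the table, ratio_from_table(3.5, 232), ratio_from_table(2.3, 157) and ratio_from_table(1.8, 119) all land between 15:1 and 15.5:1, in line with the expected ratio of around 15:1.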

Technical details

The Opus format is built upon two different codecs: CELT and SILK. Simplifying, CELT is used for music and audio in general (so it is the default codec for wideband audio), while SILK is more specialized for speech (and thus narrowband audio). Opus is also able to mix CELT and SILK in the same audio frame. Libdragon's implementation of Opus only uses CELT. SILK is more resource intensive and only makes sense in specific use cases (speech at very low bitrates). A 16 kHz speech audio file compressed with CELT will still produce very good audio quality, while still providing a 16:1 compression ratio.

By default, Opus internally works at a sample rate of 48 kHz. Support for "custom modes", which allow different sample rates, is available in the libopus codebase (though disabled by default). We activated such support for libdragon and we are experimenting with 32 kHz to reduce resource usage, but this is currently not supported. If you inspect an Opus-compressed wav64 file at runtime (via wav->wave.frequency) you will see that it always looks like a 48 kHz file (or an integer decimated version of it: 24 kHz, 12 kHz, etc.). Don't worry about that though: it is an internal detail of how Opus works. The input sample rate was still taken into account to decide the compression factor, so if you compress a 22 kHz file, the result will be about half the size of a 44 kHz file and sound worse, as expected.

Opus splits the input file into "frames" made of a fixed number of samples, and compresses them separately. We use Opus in VBR (variable bitrate) mode, which is the one that gives the best quality at any given file size; this in turn means that each frame will use a variable number of bytes in its compressed format. Currently, audioconv64 will always generate frames of exactly 20 ms of audio (960 samples, at the internal sample rate of 48 kHz), which is the default and suggested frame size for standard audio playback.
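The frame figures above follow from simple arithmetic, sketched below. The helper names are hypothetical, and the average frame size ignores any file header or per-frame overhead in the wav64 container, so treat it as an order-of-magnitude estimate only.

```c
#include <assert.h>

/* Hypothetical helper: samples in one frame of frame_ms milliseconds. */
static int samples_per_frame(int rate_hz, int frame_ms)
{
    assert(rate_hz > 0 && frame_ms > 0);
    return (rate_hz * frame_ms) / 1000;
}

/* Hypothetical helper: number of frames needed to cover a duration,
   rounding up so that a trailing partial frame is still counted. */
static long frame_count(long duration_ms, int frame_ms)
{
    assert(duration_ms >= 0 && frame_ms > 0);
    return (duration_ms + frame_ms - 1) / frame_ms;
}

/* Hypothetical helper: average compressed frame size in bytes. With VBR,
   individual frames will be larger or smaller than this average. */
static double average_frame_bytes(double total_bytes, long frames)
{
    assert(frames > 0);
    return total_bytes / (double)frames;
}
```

For instance, samples_per_frame(48000, 20) is 960, and the 19 s benchmark song needs frame_count(19000, 20) = 950 frames. Dividing its 232 KiB Opus size by 950 gives an average of roughly 250 bytes per frame.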
