Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix wrong waveform and random crackles in N163 audio #294

Merged
merged 2 commits into from
Sep 26, 2024
Merged

Conversation

nyanpasu64
Copy link
Collaborator

@nyanpasu64 nyanpasu64 commented Sep 16, 2024

This pull request aims to fix N163 audio bugs I've discovered.

Changes in this PR:

  • Properly initialize Namco163Audio emulation core in constructor.
    • Fixes bug where if a N163 channel uses the Zxx effect to set wave position, the first note plays with corrupted sound until it switches waves or plays a new note.
  • Fix random crackles in N163 playback and WAV export.
    • This was caused by N163 Blip_Buffer frames desynchronizing from global frames after reset.

Background

N163 playback, like other chips, uses Blip_Synth and Blip_Buffer for resampling. Each sound chip has its own Blip_Synth, with an independent volume control and treble filter (a set of impulse responses).

Chip emulators create audio by calling Blip_Synth::update(time_clocks, amplitude). Blip_Synth calculates delta = amplitude - previous amplitude, then inserts an impulse of size delta into a Blip_Buffer. Blip_Synth performs resampling by converting time_clocks into fractional samples, then adding a fractionally delayed impulse to Blip_Buffer's difference array starting at int(sample timestamp). (This introduces a few samples of delay, from the impulse start to center point.)

Each frame lasts an integer number of clocks, but a fractional number of samples. At the end of a frame, Blip_Buffer::read_samples() finds the fractional sample where the next frame begins, generates audio up to int(sample timestamp) using a running sum of Blip_Buffer's difference array (with optional highpass DC removal), then outputs and discards int(sample timestamp) samples (keeping the fractional part of sample timestamp, so the next frame's impulses begin midway through the new first sample of the buffer).

Bug

Dn-FT outputs audio from a global Blip_Buffer, CMixer::BlipBuffer. In order to apply (approximately emulated) audio filtering, CN163 has its own Blip_Buffer, CN163::m_BlipN163.

When we begin playback, we call CAPU::Reset()CMixer::ClearBuffer() and CN163::Reset(), which both reset Blip_Buffer::offset_ to 0. When loading a document with a nonstandard tick rate or pressing F12, CSoundGen and CAPU's frames can desync, causing CAPU to get reset outside of CAPU::EndFrame() boundaries.

The bug in this case was that when FamiTracker called CN163::Reset() without ending the previous frame normally, CN163 failed to clear its "time since frame begin" field (CN163::m_iTime). Normally this would result in steps being delayed slightly for the next frame. But CN163 has its own Blip_Buffer, so each subsequent frame starts and stops at a different fractional sample than the global Blip_Buffer, resulting in different numbers of completed samples being generated each frame.

These frames of incorrect length get mixed into the global Blip_Buffer when CN163::EndFrame() calls Output.mix_samples_raw() (where Output points to CMixer::BlipBuffer). Whenever a N163 frame doesn't match the length of its corresponding global frame, it can produce 1-sample overlaps or gaps at frame borders, resulting in audio crackling.

Fix

Make CN163::Reset() set m_iTime = 0, so the first N163 frame adds samples at the correct time, has the correct length, and all subsequent frames have the correct phase relative to the global Blip_Buffer.

Does CN163 need to have a separate Blip_Buffer? It's complicated. It's used to perform a first-order lowpass filter at 12 kHz, but this is close enough to Nyquist (at the usual sampling rate of 44-48 kHz) to have noticeable frequency warping. I don't know if this is hardware-accurate or an attempt to filter out switching noise.

Fixes first note with Zxx playing with wrong waveform.
### Background

N163 playback, like other chips, uses `Blip_Synth` and `Blip_Buffer` for
resampling. Each sound chip has its own Blip_Synth, with an independent
volume control and treble filter (a set of impulse responses).

Chip emulators create audio by calling `Blip_Synth::update
(time_clocks, amplitude)`. Blip_Synth calculates `delta = amplitude -
previous amplitude`, then inserts an impulse of size `delta` into a
`Blip_Buffer`. `Blip_Synth` performs resampling by converting
`time_clocks` into fractional samples, then adding a fractionally
delayed impulse to Blip_Buffer's difference array starting at `int
(sample timestamp)`. (This introduces a few samples of delay, from the
impulse start to center point.)

Each frame lasts an integer number of clocks, but a fractional number of
samples. At the end of a frame, `Blip_Buffer::read_samples()` finds the
sample where the next frame begins, generates audio up to that sample
using a running sum of Blip_Buffer's difference array (with optional
highpass DC removal), then drops the returned samples (keeping the
fractional part of completed samples, so the next frame's impulses
begin midway through the new first sample of the buffer).

### Bug

Dn-FT outputs audio from a global Blip_Buffer, `CMixer::BlipBuffer`. In
order to apply (approximately emulated) audio filtering, `CN163` has
its own Blip_Buffer, `CN163::m_BlipN163`.

When we begin playback, we call `CMixer::ClearBuffer()` and
`CN163::Reset()`, which both reset `Blip_Buffer::offset_` to 0. Due to
how FamiTracker is designed, this can happen at random points
mid-frame (rather than on boundaries).

The bug in this case was that when FamiTracker called `CN163::Reset
()` without ending the previous frame normally, `CN163` failed to clear
its "time since frame begin" field (`m_iTime`). Normally this would
result in steps being delayed slightly for the next frame (or at
extremely low tick rates, writing out of bounds of Blip_Buffer's
difference array). But CN163 has its own Blip_Buffer, so each
subsequent frame starts and stops at a different fractional sample than
the global Blip_Buffer, resulting in different numbers of completed
samples being generated each frame.

These frames of incorrect length get mixed into the global Blip_Buffer
when `CN163::EndFrame()` calls `Output.mix_samples_raw()`
(where `Output` points to `CMixer::BlipBuffer`). Whenever a N163 frame
doesn't match the length of its corresponding global frame, it can
produce 1-sample overlaps or gaps at frame borders, resulting in audio
crackling.

### Fix

Make `CN163::Reset()` set `m_iTime = 0`, so the first N163 frame adds
samples at the correct time, has the correct length, and all subsequent
frames have the correct phase relative to the global Blip_Buffer.

Does CN163 need to have a separate Blip_Buffer? It's complicated. It's
used to perform a first-order lowpass filter at 12 kHz, but this is
close enough to Nyquist (at the usual sampling rate of 44-48 kHz) to
have noticeable frequency warping. I don't know if this is
hardware-accurate or an attempt to filter out switching noise.
@nyanpasu64 nyanpasu64 marked this pull request as ready for review September 22, 2024 04:55
@nyanpasu64 nyanpasu64 merged commit 375950f into main Sep 26, 2024
8 of 9 checks passed
@nyanpasu64 nyanpasu64 deleted the fix-n163-init branch September 26, 2024 08:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant