Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Automatic Gain Control #621

Open
wants to merge 20 commits into
base: master
Choose a base branch
from

Conversation

UnknownSuperficialNight
Copy link

@UnknownSuperficialNight UnknownSuperficialNight commented Sep 26, 2024

Implemented automatic gain control (AGC)

I implemented a to_f32 conversion for the sample as it was missing. I'm not entirely certain if this is optimal, so I welcome feedback on the implementation.

This pull request is in relation to. Closes #620

Here is my main.rs and Cargo.toml i used to test while developing:

use rodio::source::Source;
use rodio::{Decoder, OutputStream, Sink};
use std::fs::File;
use std::io::BufReader;

fn main() {
    // Get a output stream handle to the default physical sound device
    let (_stream, stream_handle) = OutputStream::try_default().unwrap();
    // Create a sink
    let sink = Sink::try_new(&stream_handle).unwrap();

    // Load a sound from a file, using a path relative to Cargo.toml
    let file =
        BufReader::new(File::open(env!("CARGO_MANIFEST_DIR").to_owned() + "/OOTW.mp3").unwrap());
    // Decode that sound file into a source
    let source = Decoder::new(file).unwrap();

    // Amplify the source
    let amplified_source = source.automatic_gain_control(1.0, 0.3, 3.0);

    // Add the amplified source to the sink
    sink.append(amplified_source);

    // The sound plays in a separate thread.
    sink.sleep_until_end();
}
[workspace]
members = [".", "rodio"]

[package]
name = "test_area"
version = "0.1.0"
edition = "2021"

[dependencies]
rodio = { path = "rodio" }

Id suggest an attack_time of around 0.5 to 2.5 that seems to be a sweet spot for songs I tested.
Also, target_level I adjusted to be 0.9 just to make sure it does not clip the audio; it was clipping for me at 1.0.

Don't know where to write documentation or if I should be the one writing it. Idk, let me know; thanks :)

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 26, 2024

Since I do not feel fully qualified to review the algorithmic side of this, I am going to ask the wider rodio community for help in reviewing. I will also read up on AGC and see if I can find any problem with the implementation.

  • @iluvcapra you are working on synthesizers and those looked pretty complicated, would you be willing to give this a look too?

src/source/agc.rs Outdated Show resolved Hide resolved
self.current_gain = self.current_gain.clamp(0.1, self.absolute_max_gain);

// Uncomment for debugging:
println!("Current gain: {}", self.current_gain);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think we can do without these now :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ill leave them in for now so myself and others can test until everything is set in stone, then remove them before the final merge if thats ok?

Copy link
Collaborator

@dvdsk dvdsk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very readable thanks! I have some minor things here and there. The algorithm makes sense to me though ill need to read up on AGC and look closer tomorrow. Did you come up with it yourself or did you find it somewhere? In either case we should attribute it in a comment somewhere.

Regarding the to_32. That seems like a fine solution to me. I am wondering though would the algorithm still work if you had only i16? We could implement subtract, add multiply_by_f32, div_by_usize and modulo for Sample. There are trait in the std for all those I think.

That might save some conversions to f32, and in doing so speed up the implementation.

@UnknownSuperficialNight
Copy link
Author

UnknownSuperficialNight commented Sep 26, 2024

I am wondering though would the algorithm still work if you had only i16? We could implement subtract, add multiply_by_f32, div_by_usize and modulo for Sample. There are trait in the std for all those I think.

Ill look into it, but probably would need other people's input on what would be best to do there. It could be a premature optimization, but if it works, then that's a bonus.

Thanks for the suggestion :)

Did you come up with it yourself or did you find it somewhere?

I put this together myself, but I definitely used some algorithms and ideas I found online. So while I wrote the code, I built on what others have done before.

By algorithms i mean mathematical ones

In either case we should attribute it in a comment somewhere.

Where?

- Implement asymmetric attack/release
- Introduce MIN_ATTACK_TIME limit to prevent AGC instability
- Clamp attack_time to prevent instability
- Faster decrease, slower increase for smoother sound
- Safeguard against extreme gain fluctuations
@UnknownSuperficialNight
Copy link
Author

UnknownSuperficialNight commented Sep 27, 2024

Integrated it with my GUI earlier, and it's working really well—Tested on a -10db audio file. Here’s a video showing the results:

(Left is using sink without AGC)
(Right is using sink with AGC)

Video:

2024-09-27.20-00-04.mp4

Let me know what you think of the new comment style and whether it is okay and descriptive enough.

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 27, 2024

It could be a premature optimization

you got a good point there. It might very well be, its a real bummer rodio has no benchmarks/profile runs setup. We should not optimize without benchmark or profile.

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 27, 2024

Where?

maybe the top of the source file? If you can list your inspiration and put your name on it, you made an algorithm for AGC you deserve some credit for that. Something along the likes of:

// Automatic gain control by @UnknownSuperficialNight, 
// this is a combination of <concept> other <concept> it builds upon 
// the work of <name> as seen here <blog/book/something>.

Its entirely optional if you do not feel like it you can leave this out.

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 27, 2024

Integrated it with my GUI earlier, and it's working really well—Tested on a -10db audio file. Here’s a video showing the results:

Great to hear! unfortunately the video is not playing for me (neither in browser or downloaded in VLC).

Edit: video now works 🎉

Copy link
Collaborator

@dvdsk dvdsk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, this is the most well documented code I have seen in a long while ❤️. I am gonna show this to my friends. You have set a high bar for all the other PR's now :)

The only thing that could be improved is the --debug-gain flag, I think checking an argument as a library is sup-optimal.

Your comment there mentions its for both debugging and monitoring. Do you mean its useful for the end user to be able to see the number? For example so they can tweak the parameters better?

If that's true then maybe a callback would be a good idea. Automatic gain control could get an extra argument (maybe an option?).

It would look like this then:

    fn automatic_gain_control(
        self,
        target_level: f32,
        attack_time: f32,
        absolute_max_gain: f32,
        gain_monitor: Option<Box<dyn Fn(f32)>>,
    ) -> AutomaticGainControl<Self>

The current monitoring code in fn next could then become:

if let Some(callback) = self.gain_monitor {
    callback(self.current_gain);
}

I have used Box<dyn Fn> to prevent AGC getting a generic argument. That will incur a performance cost when using monitoring, its a tradeoff. What we could do is add a second source: MonitoredAutomaticGainControl which is generic over the monitor callback. Then implement the current AutomaticGainControl by just calling MonitoredAutomaticGainControl with as callback a function that does nothing. The optimizer will optimize that function out for us.

Regarding logging for debugging purposes. Its probably a good idea to add either tracing or log to Rodio. We can enable them using a compile time feature so they are zero cost for any user that does not need them. They also integrate nicely in whatever application someone is writing.

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 27, 2024

I should really add some benchmarking to Rodio. AGC is now running for every sample, might be an idea to make it run only once every n samples to lower the perf hit of AGC. But that is definitely not for now but for a future PR/issue once we have benchmarking integrated into Rodio's CI.

@UnknownSuperficialNight
Copy link
Author

might be an idea to make it run only once every n samples to lower the perf hit of AGC

Running every n samples could work, but we risk issues with the audio not being able to adjust when needed. For example, a bass drop or a loud kick would possibly have a gain that is too high, making a crackling sound.

Instead, we can make a threshold that logs the previous values and detects a large enough change.

This is my testing while developing (AGC responsiveness threshold)

Performance results from testing based on cpu usage:
Every single thread (baseline):
1.4% to 4.2% of 1 core (avg 2.8%-4.2%)

Using a advanced detection method:
1.4% to 4.2% of 1 core (avg 2.8%)

Using updated and simplified advanced detection method:
1.4% to 4.2% of 1 core (same as above but no perceivable change in audio)

Although it most of the time indicates a 2.8% cpu usage, it's still such a small difference that I don't know if it's worth it. Plus, this is all using one core. By that, I mean 100% equals one core, and 200% equals two cores fully maxed out. In this case, we are using 1/36th of a core on average. So, maybe it's not worth it; the more delay we have, the more potential audio issues we will encounter, as it might not be able to react in time to lower it enough reasonably.

In any case here is the code for that if we want to explore it in the future

/// Checks if the peak level change is significant enough to trigger a gain update
///
/// This method calculates the absolute difference between the current peak level
/// and the last significant peak level. It considers a change to be significant
/// if this difference is 0.01 or greater, which is equivalent to a 1% change
/// in the signal level.
#[inline]
fn is_significant_change(&self) -> bool {
// Calculate the absolute difference directly using f32
let difference = (self.peak_level - self.last_significant_peak).abs();

// Consider a change significant if it's 0.01 or more (1% change)
 difference >= 0.01
}

@UnknownSuperficialNight
Copy link
Author

UnknownSuperficialNight commented Sep 27, 2024

I should really add some benchmarking to Rodio

definitely a good idea

Honestly, this is the most well documented code I have seen in a long while ❤️. I am gonna show this to my friends. You have set a high bar for all the other PR's now :)

Thank you so much! That really means a lot to me☺️

The only thing that could be improved is the --debug-gain flag, I think checking an argument as a library is sup-optimal.
Your comment there mentions its for both debugging and monitoring. Do you mean its useful for the end user to be able to see the number? For example so they can tweak the parameters better?
If that's true then maybe a callback would be a good idea. Automatic gain control could get an extra argument (maybe an option?).

Originally, it was just so I could view the changes while developing. However, now that I've refined the algorithm, it's become quite handy to read the values visually. This way, you don't have to guess using your ears. Realizing this, I'm planning to add a toggle for it, so we don't need to parse and loop over all arguments.

maybe the top of the source file? If you can list your inspiration and put your name on it, you made an algorithm for AGC you deserve some credit for that. Something along the likes of:

I don’t mind. ☺️ However, to clarify, my AGC algorithm is inspired by various online resources related to gain control concepts, including discussions on the RMS algorithm. While I didn't base it on a specific algorithm, I synthesized ideas from multiple sources to create my implementation.

If you're asking about the original inspiration for why I decided to work on this, it's because I am creating a graphical music player entirely in Rust, with the exception of two C backup dependencies. I wanted the audio to be automatically leveled and clear.

@UnknownSuperficialNight
Copy link
Author

Force pushed so i don't add redundant commits to the history as i just merged the commits together

@UnknownSuperficialNight
Copy link
Author

UnknownSuperficialNight commented Sep 27, 2024

I have used Box to prevent AGC getting a generic argument. That will incur a performance cost when using monitoring, its a tradeoff. What we could do is add a second source: MonitoredAutomaticGainControl which is generic over the monitor callback. Then implement the current AutomaticGainControl by just calling MonitoredAutomaticGainControl with as callback a function that does nothing. The optimizer will optimize that function out for us.

This is a good idea but i wonder since our users will only do this once probably in debug mode why not just do

// Output current gain value for debugging purposes
#[cfg(debug_assertions)]
println!("Current gain: {}", self.current_gain);

The upside is that Rust's compiler's dead code elimination typically prevents unused functions and types from being included in the final binary of the application using this library. So, if people don't call automatic_gain_control, it shouldn't be compiled in. This allows us to benefit from the debug features without affecting other users or adding MonitoredAutomaticGainControl, which could increase the Rodio binary size. However, the downside is that if we do call automatic_gain_control, it will always compile the println in debug mode. What do you think?

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 28, 2024

Although it most of the time indicates a 2.8% cpu usage, it's still such a small difference that I don't know if it's worth it.

I completely agree, thats not worth the effort and increased code complexity 👍

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 28, 2024

However, the downside is that if we do call automatic_gain_control, it will always compile the println in debug mode. What do you think?

If anyone needs access to the value via a callback they can always open an issue, and then its easy enough to add. Always printing in debug mode seems like a bad idea, imagine having a tui app and in debug mode the view gets corrupted 😅. But the idea of determining compile time whether to print it is good.

I see two options there (maybe there are more):

  • add a feature to rodio, something like "print-agc-gain". Would still mess up tui-apps.
  • integrate tracing or log into rodio. They are what the rust community has standardised on. They can be disabled using a compile-time-feature, and allow the app developer to determine where the log messages go (stdout/file/some online log service). They can even be configured at runtime whether to display logs from a certain dependency.

It seems best to go for the second option, though that is out of the scope for this PR, ill see if I can quickly add tracing to rodio.

@UnknownSuperficialNight
Copy link
Author

UnknownSuperficialNight commented Sep 28, 2024

It seems best to go for the second option, though that is out of the scope for this PR, ill see if I can quickly add tracing to rodio.

Yea that seems best also look into env_logger it's integrated directly with log and allows control over logging through environment variables

Like this:
DEBUG_AUDIO=1 cargo run

Clarification I've never used any of these crates before, but I've heard of env_logger and the others.

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 28, 2024

Clarification I've never used any of these crates before, but I've heard of env_logger and the others.

tracing builds upon log and env_logger, its a bit heavier to compile but I think that is okay since you have to opt into it. These crates are always split into two parts: the core that only gives you the println equivalent and the part that uses those macro's to actually log somewhere. The app developer should use the latter and the libs the former.

I have added tracing on main now, see 95a466e on how to use it. And to see those traces in your tests/application you will want to do something like this: https://github.com/tokio-rs/tracing/blob/master/examples/examples/fmt.rs. For logging in AGC I would suggest tracing::trace("some msg/status").

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 28, 2024

(I have not looked at the new changes yet)

I am working on adding benchmarks, and I notice that some sources accept any Sample type while others want f32. I would like to unify that in the future. As long as AGC has to cast samples to f32 it might be best to just demand that the Sample is f32 in the bounds. (see high_pass source as an example of this).

There should probably be a contributors guide that mentions stuff like this... ill add it to my todo :)

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 28, 2024

As long as AGC has to cast samples to f32 it might be best to just demand that the Sample is f32 in the bounds. (see high_pass source as an example of this).

Okay I really do not know whats best now, I just noticed amplify using Sample::amplify but thats just u16 as f32 * value. I'll add a benchmark on the cost of a conversion like that. Then get back to you. Its probably basically free, on the other hand there is the sample_converter Source must be there for a reason.

@UnknownSuperficialNight
Copy link
Author

UnknownSuperficialNight commented Sep 28, 2024

There should probably be a contributors guide that mentions stuff like this... ill add it to my todo :)

Yea a decent contributors guide can go a long way

Okay I really do not know what's best now, I just noticed amplify using Sample::amplify but that's just u16 as f32 * value. I'll add a benchmark on the cost of a conversion like that. Then get back to you. Its probably basically free, on the other hand there is the sample_converter Source must be there for a reason.

Yea probably costs nothing.

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 29, 2024

The reason I say this is because the default tracing_subscribers is so verbose that when using RUST_LOG=debug, it causes a delay in AGC that causes it to cut out audio constantly.

Thats with a release build? Kinda suprises me tbh. Yeah adding such a not is worth it even if its only on debug builds.

@UnknownSuperficialNight
Copy link
Author

UnknownSuperficialNight commented Sep 29, 2024

Thats with a release build? Kinda suprises me tbh. Yeah adding such a not is worth it even if its only on debug builds.

Nope that's on a debug build on release the init it is ok

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 29, 2024

New improvements added in the latest commit:

Very nice, ill have a look :)

Its a pitty I did not find time to do the benchmarks earlier. They are done now though 👍. You can merge them in from main and add your own in benches/effects.rs It should be pretty easy, just copy the high_pass bench.

You run the benchmarks with cargo bench.

From what I can see the conversions to f32 are, as expected, basically free:

conversions     fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ from_i16_to                │               │               │               │         │
   ├─ f32       254.9 µs      │ 487.7 µs      │ 256.5 µs      │ 266.1 µs      │ 100     │ 100
   ├─ i16       189.8 µs      │ 412.9 µs      │ 191.1 µs      │ 198.5 µs      │ 100     │ 100
   ╰─ u16       245.3 µs      │ 435.8 µs      │ 247.3 µs      │ 257.8 µs      │ 100     │ 100

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 29, 2024

next review will have to wait until after the weekend. Have a great sunday btw 👍

@UnknownSuperficialNight
Copy link
Author

UnknownSuperficialNight commented Sep 29, 2024

next review will have to wait until after the weekend. Have a great sunday btw 👍

Ok check u then ill be here refining it more 👋

Have a great sunday aswell

@UnknownSuperficialNight
Copy link
Author

UnknownSuperficialNight commented Sep 29, 2024

When you get back have a look at this let me know if its to verbose or if there should be any tweaks

I figured id add a tldr then a full explanation for those who want it

but it might be to much info so looking for a 2nd opinion

2024-09-29_15-47

@UnknownSuperficialNight
Copy link
Author

UnknownSuperficialNight commented Sep 30, 2024

I can't think of any other improvements I could make to the code; so far, the current state is near final.

Do note that some very edge case songs can have 1 or 2 samples of slight distortion only when the sound goes from almost zero to full blast

Video below is the agc version

song_waveform.mp4

Here is the source video for reference

song_waveform2.mp4

I don't know if you can hear the little clip when the beat drops, but this kind of thing is common with agc implementations and is an edge case for almost all agc. You can mitigate it with lookahead buffers, but that adds more latency and probably won't fully stop it. Remember, it has to be from almost absolute zero.

99% of songs will not be affected this is part of that 1% that will.

Edit: if the videos do not load refresh the page

Ironically, I kind of like it; it makes the bass a little punchier at the boost from near absolute zero to max, but it is something to be acknowledged as a downside.

Edit2:
Worth noting choosing a absolute_max_gain that's lower will mitigate this and so will having a higher attack_time

The video is using the default:
attack_time=4.0
absolute_max_gain=5.0

- Increase RMS_WINDOW_SIZE for more stable measurements of very low frequencies
- Replace MIN_ATTACK_COEFF with release_coeff for improved customizability

These changes provide better handling of low-frequency content and allow
developers more control over AGC behavior, particularly in attack/release
time adjustments.
Implement get_agc_control() to allow dynamic enabling/disabling of AGC
during audio playback.
- Add references to get_agc_control method in automatic_gain_control docs
- Include a quick start example demonstrating usage of get_agc_control
@dvdsk
Copy link
Collaborator

dvdsk commented Sep 30, 2024

I figured id add a tldr then a full explanation for those who want it

ohh nice idea, love it! 👍

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 30, 2024

I don't know if you can hear the little clip when the beat drops

I asked a friend to listen in, we both did not hear it. So its not an issue for us, I would say keep it as is. If someone has a problem with it they are of course free to add that look-ahead buffer as a variation on the current AGC (maybe with some generics).

impl CircularBuffer {
/// Creates a new CircularBuffer with a fixed size determined at compile time.
///
/// The `_size` parameter is ignored as the buffer size is set by `RMS_WINDOW_SIZE`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then why is it there? :)

/// which is crucial for real-time audio processing.
#[derive(Clone, Debug)]
struct CircularBuffer {
buffer: [f32; RMS_WINDOW_SIZE],
Copy link
Collaborator

@dvdsk dvdsk Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

32Kb is a rather large allocation on the stack. I am kinda surprised that clippy is not complaining here. Might be better to heap allocate this (vec).

You could also try replacing buffer+index in CircularBuffer with VecDeque. But that might have some negligible overhead. Since you added a benchmark we can measure that now 😝.


/// Calculates the mean of all values in the buffer.
///
/// This operation is O(1) due to the maintained running sum.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 🥳

/// * `release_time` - Time constant for gain decrease
/// * `absolute_max_gain` - Maximum allowable gain
#[inline]
pub fn automatic_gain_control<I>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be pub(crate) I am pretty sure this is wrong in most rodio sources, see: #622 for more info.

min_attack_coeff: release_time,
peak_level: 0.0,
rms_window: CircularBuffer::new(RMS_WINDOW_SIZE),
is_enabled: Arc::new(AtomicBool::new(true)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you added 'dynamic' controls in there. The current way we do controls in rodio is through the sink. @PetrGlad had the idea to try and do this with Arcs + Atomics just like you do here. I have no idea what the performance impact of this is, its something I have been discussing in #619 and which will move to its own issue in the future.

For now its a bit strange that enabled can be changed via a handle (gotten through get_agc_control) but the other variables can not. @PetrGlad AGC might be a good candidate for benchmarking dynamic controls (though atomics+arcs) since it has so many variables.

Copy link
Collaborator

@dvdsk dvdsk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your ahead of us regarding the AGC control. Then there is a stack size issue we might wanna adress. Finally we should add AGC to the sink so everyone can easily enable it provided that AGC has hardly any preformance impact once disabled (I think you are already benchmarking that). Lets do that in a second PR though! Or if you have no time I can do that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Idea of implementing AGC (Automatic Gain Control)
2 participants