NNlib.hamming_window — Function

hamming_window(
    window_length::Int, ::Type{T} = Float32; periodic::Bool = true,
    α::T = T(0.54), β::T = T(0.46),
) where T <: Real

Hamming window function (ref: Window function § Hann and Hamming windows - Wikipedia). Generalized version of hann_window.

$w[n] = \alpha - \beta \cos(\frac{2 \pi n}{N - 1})$

Where $N$ is the window length.

julia> lineplot(hamming_window(100); width=30, height=10)
[Plot output: Hamming window, x-axis from 0 to 100; the plot body is not shown in the source]

Arguments:
window_length::Int: Size of the window.
::Type{T}: Element type of the window.

Keyword Arguments:
periodic::Bool: If true (default), returns a window to be used as a periodic function. If false, returns a symmetric window.

    The following always holds:

    julia> N = 256;

    julia> hamming_window(N; periodic=true) ≈ hamming_window(N + 1; periodic=false)[1:end - 1]
    true

α::Real: Coefficient α in the equation above.
β::Real: Coefficient β in the equation above.

Returns:
Vector of length window_length and eltype T.
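
The window equation and the periodic/symmetric relationship above can be sketched in plain Julia. This is only an illustration under stated assumptions: `hamming_sketch` is a hypothetical helper, not NNlib's implementation, and it assumes that a periodic window is computed as a symmetric window of length `window_length + 1` with the last sample dropped, which is what the documented identity implies.

```julia
# Hypothetical helper (not part of NNlib): evaluates
# w[n] = α - β * cos(2πn / (N - 1)) directly.
function hamming_sketch(window_length::Int; periodic::Bool = true, α = 0.54, β = 0.46)
    # A single-sample window would divide by zero below, so handle it separately.
    window_length == 1 && return [1.0]
    # Assumption: a periodic window equals a symmetric window of length N + 1
    # without its final sample.
    N = window_length + (periodic ? 1 : 0)
    return [α - β * cos(2π * n / (N - 1)) for n in 0:window_length - 1]
end

N = 256
# The documented identity holds for the sketch as well.
@assert hamming_sketch(N; periodic = true) ≈ hamming_sketch(N + 1; periodic = false)[1:end - 1]
```

Setting `α = 0.5` and `β = 0.5` in the sketch gives the Hann window, matching the description of `hamming_window` as a generalized version of `hann_window`.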

NNlib.stft — Function

stft(x;
    n_fft::Int, hop_length::Int = n_fft ÷ 4, window = nothing,
    center::Bool = true, normalized::Bool = false,
)

Short-time Fourier transform (STFT).

The STFT computes the Fourier transform of short overlapping windows of the input, giving frequency components of the signal as they change over time.

$Y[\omega, m] = \sum_{k = 0}^{N - 1} \text{window}[k] \text{input}[m \times \text{hop length} + k] \exp(-j \frac{2 \pi \omega k}{\text{n fft}})$

where $N$ is the window length, $\omega$ is the frequency $0 \le \omega < \text{n fft}$ and $m$ is the index of the sliding window.

Arguments:
x: Input, must be either a 1D time sequence ((L,) shape) or a 2D batch of time sequences ((L, B) shape).

Keyword Arguments:
n_fft::Int: Size of Fourier transform.
hop_length::Int: Distance between neighboring sliding window frames.
window: Optional window function to apply. Must be a 1D vector with 0 < length(window) ≤ n_fft. If window is shorter than n_fft, it is padded with zeros on both sides. If nothing (default), then no window is applied.
center::Bool: Whether to pad input on both sides so that the $t$-th frame is centered at time $t \times \text{hop length}$. Padding is done with the pad_reflect function.
normalized::Bool: Whether to return normalized STFT, i.e. multiplied with $\text{n fft}^{-0.5}$.

Returns:
Complex array of shape (n_fft, n_frames, B), where B is the optional batch dimension.
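
To make the STFT sum concrete, here is a direct, unoptimized evaluation of the formula in plain Julia. It is a sketch only: `naive_stft` is a hypothetical name, it handles a single 1D signal, and it assumes no centering or padding, no normalization, and a window of exactly `n_fft` samples. It is not how NNlib computes the transform.

```julia
# Hypothetical helper (not part of NNlib): evaluates
# Y[ω, m] = Σ_k window[k] * input[m * hop_length + k] * exp(-j 2π ω k / n_fft)
# term by term, with zero-based ω, m, k mapped onto Julia's one-based arrays.
function naive_stft(x::AbstractVector{<:Real};
                    n_fft::Int, hop_length::Int = n_fft ÷ 4, window = ones(n_fft))
    # Number of full frames that fit into the signal without padding.
    n_frames = 1 + (length(x) - n_fft) ÷ hop_length
    Y = zeros(ComplexF64, n_fft, n_frames)
    for m in 0:n_frames - 1, ω in 0:n_fft - 1
        acc = 0.0 + 0.0im
        for k in 0:n_fft - 1
            # cis(θ) is exp(jθ)
            acc += window[k + 1] * x[m * hop_length + k + 1] * cis(-2π * ω * k / n_fft)
        end
        Y[ω + 1, m + 1] = acc
    end
    return Y
end

# A constant signal has all of its energy in the zero-frequency bin of every frame.
Y = naive_stft(ones(16); n_fft = 8, hop_length = 4)
@assert size(Y) == (8, 3)
```

The triple loop costs O(n_fft²) per frame; it is meant for checking the definition on small inputs, not for real workloads.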

NNlib.istft — Function

istft(y;
    n_fft::Int, hop_length::Int = n_fft ÷ 4, window = nothing,
    center::Bool = true, normalized::Bool = false,
    return_complex::Bool = false,
    original_length::Union{Nothing, Int} = nothing,
)

Inverse Short-time Fourier Transform.

Returns the least-squares estimate of the original signal.

Arguments:
y: Input complex array in the (n_fft, n_frames, B) shape, where B is the optional batch dimension.

Keyword Arguments:
n_fft::Int: Size of Fourier transform.
hop_length::Int: Distance between neighboring sliding window frames.
window: Window function that was applied to the input of stft. If nothing (default), then no window was applied.
center::Bool: Whether the input to stft was padded on both sides so that the $t$-th frame is centered at time $t \times \text{hop length}$. Padding is done with the pad_reflect function.
normalized::Bool: Whether the input to stft was normalized.
return_complex::Bool: Whether the output should be complex, or if the input should be assumed to derive from a real signal and window.
original_length::Union{Nothing, Int}: Optional size of the first dimension of the input to stft. Helps restore the exact stft input size. Otherwise, the array might be a bit shorter.

NNlib.power_to_db — Function

power_to_db(s; ref::Real = 1f0, amin::Real = 1f-10, top_db::Real = 80f0)

Convert a power spectrogram (amplitude squared) to decibel (dB) units.

Arguments
s: Input power.
ref: Scalar w.r.t. which the input is scaled.
amin: Minimum threshold for s.
top_db: Threshold the output at top_db below the peak: max.(s_db, maximum(s_db) - top_db).

Returns
s_db ~= 10 * log10(s) - 10 * log10(ref)

NNlib.db_to_power — Function

db_to_power(s_db; ref::Real = 1f0)

Inverse of power_to_db.
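
The dB conversion and its inverse can be sketched in plain Julia as follows. The names `power_to_db_sketch` and `db_to_power_sketch` are hypothetical, and the sketch assumes that `amin` is applied as a lower clamp on both `s` and `ref` before taking the logarithm; NNlib's actual implementation may differ in such details.

```julia
# Hypothetical helpers (not part of NNlib) following the documented relations.
function power_to_db_sketch(s; ref = 1.0, amin = 1e-10, top_db = 80.0)
    # s_db ≈ 10 * log10(s) - 10 * log10(ref), with inputs clamped from below by amin.
    s_db = 10 .* log10.(max.(s, amin)) .- 10 * log10(max(ref, amin))
    # Threshold the output at top_db below the peak.
    return max.(s_db, maximum(s_db) - top_db)
end

# Inverse mapping: s = ref * 10^(s_db / 10).
db_to_power_sketch(s_db; ref = 1.0) = ref .* 10 .^ (s_db ./ 10)

# A factor of 100 in power corresponds to 20 dB.
@assert power_to_db_sketch([1.0, 100.0]) ≈ [0.0, 20.0]
# The round trip recovers the input as long as nothing was clamped or thresholded.
@assert db_to_power_sketch(power_to_db_sketch([0.5, 2.0])) ≈ [0.5, 2.0]
```

Values that fall below `amin` or more than `top_db` under the peak are clipped, so the inverse is exact only for inputs that stay inside that range.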

NNlib.melscale_filterbanks — Function

melscale_filterbanks(;
    n_freqs::Int, n_mels::Int, sample_rate::Int,
    fmin::Float32 = 0f0, fmax::Float32 = Float32(sample_rate ÷ 2))

Create triangular Mel scale filter banks (ref: Mel scale - Wikipedia). Each column is a filterbank that highlights its own frequency.

Arguments:
n_freqs::Int: Number of frequencies to highlight.
n_mels::Int: Number of mel filterbanks.
sample_rate::Int: Sample rate of the audio waveform.
fmin::Float32: Minimum frequency in Hz.
fmax::Float32: Maximum frequency in Hz.

Returns:
Filterbank matrix of shape (n_freqs, n_mels) where each column is a filterbank.

julia> n_mels = 8;
[Plot output: mel filterbanks, x-axis from 0 to 200; the rest of the example and most of the plot are not shown in the source]

NNlib.spectrogram — Function

spectrogram(waveform;
    pad::Int = 0, n_fft::Int, hop_length::Int, window,
    center::Bool = true, power::Real = 2.0,
    normalized::Bool = false, window_normalized::Bool = false,
)

Create a spectrogram or a batch of spectrograms from a raw audio signal.

Arguments
pad::Int: The amount of padding to apply on both sides.
window_normalized::Bool: Whether to normalize the waveform by the window's L2 energy.
power::Real: Exponent for the magnitude spectrogram (must be ≥ 0), e.g. 1 for magnitude, 2 for power, etc. If 0, the complex spectrum is returned instead.

See stft for other arguments.

Returns
Spectrogram in the shape (T, F, B), where T is the number of window hops and F = n_fft ÷ 2 + 1.

Example:
using FFTW # <- required for STFT support.
using NNlib
using FileIO
using Makie, CairoMakie
# ... (intermediate lines of the example are not shown in the source) ...
fb = melscale_filterbanks(; n_freqs, n_mels=128, sample_rate=Int(sampling_rate))
mel_spec = permutedims(spec, (2, 1, 3)) ⊠ fb # (time, n_mels)
fig = heatmap(NNlib.power_to_db(mel_spec)[:, :, 1])
save("mel-spectrogram.png", fig)

Waveform | Spectrogram | Mel Spectrogram |
---|---|---|
This document was generated with Documenter.jl version 1.7.0 on Saturday 16 November 2024. Using Julia version 1.9.4.