Bug report: Unreliable detection of speech offset #64

gaoyingming · 2020-12-16T18:29:54Z

This interface works fine for the onset of valid speech, but it has a delay of about 100 ms at the offset of speech.

For the same frame, the function 'vad.is_speech' returns different decisions when it is called repeatedly: sometimes 'True', sometimes 'False'. I found that this usually happens for frames around the offset position.
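The behaviour described above can be checked with a small harness. This is a minimal sketch, not the library's own test: `repeated_decisions`, `is_consistent`, and `demo_with_webrtcvad` are hypothetical helper names, and the frame format (16 kHz, 16-bit mono PCM, 30 ms, so 960 bytes) is an assumption chosen because `webrtcvad` accepts it. The harness only takes a callable, so it works with any detector that has the same call shape.

```python
from typing import Callable, List


def repeated_decisions(
    is_speech: Callable[[bytes, int], bool],
    frame: bytes,
    sample_rate: int,
    repeats: int = 10,
) -> List[bool]:
    """Call the detector `repeats` times on the SAME frame and collect the results."""
    return [bool(is_speech(frame, sample_rate)) for _ in range(repeats)]


def is_consistent(decisions: List[bool]) -> bool:
    """True if every decision in the list is identical."""
    return len(set(decisions)) <= 1


def demo_with_webrtcvad(frame: bytes, sample_rate: int = 16000) -> List[bool]:
    """Hypothetical demo: run the harness against one reused webrtcvad.Vad.

    Assumes `frame` holds 10, 20, or 30 ms of 16-bit mono PCM at `sample_rate`.
    """
    import webrtcvad  # third-party package; imported here so the helpers above stay stdlib-only

    vad = webrtcvad.Vad(3)  # aggressiveness 3, as used later in this thread
    return repeated_decisions(vad.is_speech, frame, sample_rate)
```

If `is_consistent(demo_with_webrtcvad(frame))` is false for a frame taken near the speech offset, that reproduces the report. One possible reading, stated as an assumption rather than a confirmed cause, is that the detector keeps internal state that each call updates.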

OmarHory · 2021-01-22T15:25:45Z

The same is happening to me. With aggressiveness set to 3, is_speech detects my noisy laptop mic as speech, then sometimes it goes False. My room is completely silent.
By comparison, when I use a professional microphone it does really well. I am not sure what the problem is!

PingAnPH · 2022-03-03T05:48:06Z

I have the same problem too. If I instantiate the vad once with vad = webrtc.vad() and repeatedly run vad.is_speech() on the same segment, I get different results. But when I instantiate the vad again, it returns the same result as on the first invocation.
It seems that the parameters of the vad are always changing. Does anyone know why?
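The comparison described here (one reused instance versus a new instance per call) can be sketched as follows. The function names and the `make_detector` factory are hypothetical, introduced only for illustration; with the real package the factory would be something like `lambda: webrtcvad.Vad(3)`.

```python
from typing import Any, Callable, List


def decisions_reused(
    make_detector: Callable[[], Any],
    frame: bytes,
    sample_rate: int,
    repeats: int = 10,
) -> List[bool]:
    """Create ONE detector and call it `repeats` times on the same frame."""
    detector = make_detector()
    return [bool(detector.is_speech(frame, sample_rate)) for _ in range(repeats)]


def decisions_fresh(
    make_detector: Callable[[], Any],
    frame: bytes,
    sample_rate: int,
    repeats: int = 10,
) -> List[bool]:
    """Create a NEW detector for every call, so no state carries over between calls."""
    return [
        bool(make_detector().is_speech(frame, sample_rate)) for _ in range(repeats)
    ]
```

If `decisions_fresh` is constant while `decisions_reused` varies, that matches the observation above and would be consistent with per-instance state changing between calls. Creating a new detector per frame is meant here only as a diagnostic; on a real audio stream it would presumably discard whatever the detector has learned from earlier frames.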

sparshsinha123 · 2022-06-03T07:12:20Z

@gaoyingming @OmarHory @PingAnPH @wiseman do we have an explanation for this? I also observed this in some experiments.
