Bug report: Unreliable detection of speech offset #64

Open
gaoyingming opened this issue Dec 16, 2020 · 3 comments

Comments

@gaoyingming

This interface detects the onset of valid speech reliably, but it reports the offset of speech about 100 ms late.
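
To put a number on the offset delay, one option is to compare the end of the last frame flagged as speech with a hand-labelled offset time. This is a minimal sketch, assuming fixed-length frames; the function name `offset_lag_ms`, the 10 ms frame length, and the example decisions are hypothetical illustrations, not measurements from this issue:

```python
def offset_lag_ms(decisions, frame_ms, true_offset_ms):
    """Return detected offset minus labelled offset, in milliseconds.

    decisions: per-frame booleans in time order (for example, collected
    from vad.is_speech). The detected offset is taken as the end time of
    the last frame flagged True.
    """
    speech_frames = [i for i, d in enumerate(decisions) if d]
    if not speech_frames:
        raise ValueError("no frame was flagged as speech")
    detected_offset_ms = (speech_frames[-1] + 1) * frame_ms
    return detected_offset_ms - true_offset_ms


# Made-up decisions for 10 ms frames: speech is labelled as ending at
# 300 ms, but the detector keeps reporting True for 10 more frames.
decisions = [True] * 40 + [False] * 10
lag = offset_lag_ms(decisions, frame_ms=10, true_offset_ms=300)
print(lag)
```

A positive value means the detector released later than the label; a negative value means it released early.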

For the same frame, the function 'vad.is_speech' returns different decisions when it is called repeatedly: sometimes 'True', sometimes 'False'. I found that this usually happens with frames around the offset position.
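
The repeated-call behaviour can be checked with a small harness that feeds one identical frame to a single detector instance and records every decision. This is a sketch only: `make_tone_frame`, `repeated_decisions`, and `ToyAdaptiveVad` are hypothetical names, and the toy class is a stand-in with an adapting threshold, not the real detector. It is included so the harness runs without the library installed, and to show one way a detector that keeps state between calls could change its answer for the same input. With py-webrtcvad installed, a `webrtcvad.Vad(3)` object could be passed in its place.

```python
import math
import struct


def make_tone_frame(sample_rate=16000, frame_ms=30, freq_hz=440.0, amplitude=8000):
    """Return one frame of 16-bit mono little-endian PCM (a sine tone) as bytes."""
    n = sample_rate * frame_ms // 1000
    samples = [int(amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate))
               for i in range(n)]
    return struct.pack("<%dh" % n, *samples)


def repeated_decisions(vad, frame, sample_rate, n_calls=10):
    """Call vad.is_speech on the SAME frame n_calls times using ONE instance."""
    return [bool(vad.is_speech(frame, sample_rate)) for _ in range(n_calls)]


class ToyAdaptiveVad:
    """Hypothetical stand-in (NOT webrtcvad): its threshold drifts toward the
    energy of every frame it sees, so its state changes on each call."""

    def __init__(self):
        self.threshold = 0.0

    def is_speech(self, frame, sample_rate):
        n = len(frame) // 2
        samples = struct.unpack("<%dh" % n, frame)
        energy = sum(s * s for s in samples) / n
        decision = energy > 2.0 * self.threshold
        # Adapt: move the threshold halfway toward the observed energy.
        self.threshold += 0.5 * (energy - self.threshold)
        return decision


frame = make_tone_frame()
results = repeated_decisions(ToyAdaptiveVad(), frame, 16000)
print(results)
```

If the list contains both `True` and `False` for a real detector, the decision depends on more than the frame contents alone.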

@OmarHory

The same is happening to me. With aggressiveness set to 3, is_speech classifies my noisy laptop mic as speech, then sometimes flips to False, even though my room is completely silent.
By comparison, it does really well when I use a professional microphone. I am not sure what the problem is!

@PingAnPH

PingAnPH commented Mar 3, 2022

I have the same problem too. If I instantiate the VAD once, vad = webrtcvad.Vad(), and repeatedly run vad.is_speech() on the same segment, I get different results. But when I instantiate the VAD again, it returns the same result as the first call did.
It seems that the VAD's internal parameters are always changing. Does anyone know why?
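
The observation that a new instance reproduces the first result suggests a possible workaround when repeatable decisions on an isolated segment are needed: build a new detector for every evaluation. This is a sketch under that assumption; `decide_with_fresh_instance` and `CountingVad` are hypothetical names, and `CountingVad` is a toy stateful stand-in, not the real detector. With py-webrtcvad installed, the factory could be something like `lambda: webrtcvad.Vad(3)`.

```python
def decide_with_fresh_instance(make_vad, frame, sample_rate):
    """Evaluate one frame with a brand-new detector so that no earlier
    call can influence the decision."""
    return bool(make_vad().is_speech(frame, sample_rate))


class CountingVad:
    """Hypothetical stateful stand-in (NOT webrtcvad): returns True only
    on its first call, to mimic a detector whose state drifts."""

    def __init__(self):
        self.calls = 0

    def is_speech(self, frame, sample_rate):
        self.calls += 1
        return self.calls == 1


frame = b"\x00\x00" * 480  # 30 ms of 16-bit mono PCM at 16 kHz (silence)

# Reusing one instance: the answer changes between calls.
shared = CountingVad()
reused = [shared.is_speech(frame, 16000) for _ in range(5)]

# Fresh instance per call: every evaluation starts from the same state.
fresh = [decide_with_fresh_instance(CountingVad, frame, 16000) for _ in range(5)]

print(reused)
print(fresh)
```

Resetting the detector for every call would also discard whatever context it builds up across a stream, so this probably suits offline comparisons better than live audio.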

@sparshsinha123

@gaoyingming @OmarHory @PingAnPH @wiseman do we know the reason for this? I have also observed this in some experiments.

4 participants