Recognizing speech clips a few seconds long #204

Open
xingjunhong opened this issue Jul 31, 2023 · 3 comments

Comments

@xingjunhong

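Assumption: there is a speech clip a few seconds long. It contains a keyword, and everything else in it is noise.
Question: how can I find the start and end positions of the keyword, and then recognize it?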

@majianjia
Owner

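The data is fed in frame by frame through a sliding window. You can combine that with VAD (voice activity detection) to get the start and end timestamps.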
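To illustrate the idea, here is a minimal sketch of a sliding-window VAD that returns start and end timestamps. It is not code from this project. It uses a plain energy threshold as a stand-in for a real VAD, and the function names, the 16 kHz sample rate, the 25 ms frame, the 10 ms hop, and the threshold are all assumptions chosen for the example.

```python
import math

def frame_energies(samples, frame_len, hop_len):
    """Mean squared amplitude of each sliding-window frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def find_speech_span(samples, sample_rate, frame_len, hop_len,
                     threshold, min_frames=3):
    """Return (start_sec, end_sec) of the first run of at least
    min_frames consecutive frames whose energy exceeds threshold,
    or None if there is no such run."""
    energies = frame_energies(samples, frame_len, hop_len)
    run_start = None
    # The -inf sentinel closes a run that reaches the end of the clip.
    for i, energy in enumerate(energies + [float("-inf")]):
        if energy > threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_frames:
                start_sec = run_start * hop_len / sample_rate
                end_sec = ((i - 1) * hop_len + frame_len) / sample_rate
                return start_sec, end_sec
            run_start = None
    return None

# Synthetic clip (assumed values): 1 s silence, 0.5 s tone, 1 s silence.
SAMPLE_RATE = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE // 2)]
clip = [0.0] * SAMPLE_RATE + tone + [0.0] * SAMPLE_RATE

span = find_speech_span(clip, SAMPLE_RATE, frame_len=400, hop_len=160,
                        threshold=0.01)
print(span)
```

The samples between the two timestamps can then be cut out and passed to the keyword model. With real noise, a fixed energy threshold is fragile, so a proper VAD would likely replace `frame_energies` here.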

@xingjunhong
Author

Looking at main_pc.c, I see that inference produces one result per second. Can I use that result as the start position?

@majianjia
Owner

That depends on what type of model you use. With an RNN-style model, there is a frame every ten-odd milliseconds.
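As a rough sketch of how frame-level output could be turned into timestamps: suppose the model gives one keyword score per frame. The code below smooths those scores and reports the longest run above a threshold in milliseconds. The function names, the 10 ms frame shift, the threshold, and the smoothing window are all hypothetical, and the scores are made up for the example rather than taken from any model.

```python
def smooth(scores, window):
    """Trailing moving average over up to `window` frames."""
    out = []
    total = 0.0
    for i, score in enumerate(scores):
        total += score
        if i >= window:
            total -= scores[i - window]
        out.append(total / min(i + 1, window))
    return out

def keyword_span(scores, frame_shift_ms, threshold=0.5, window=5):
    """Return (start_ms, end_ms) of the longest run of smoothed
    scores above threshold, or None if no frame exceeds it."""
    smoothed = smooth(scores, window)
    best = None
    run_start = None
    # The -inf sentinel closes a run that reaches the last frame.
    for i, score in enumerate(smoothed + [float("-inf")]):
        if score > threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if best is None or i - run_start > best[1] - best[0]:
                best = (run_start, i)
            run_start = None
    if best is None:
        return None
    return best[0] * frame_shift_ms, best[1] * frame_shift_ms

# Made-up per-frame scores: keyword active in frames 50..89.
scores = [0.0] * 50 + [1.0] * 40 + [0.0] * 50
print(keyword_span(scores, frame_shift_ms=10))
```

The trailing average delays both edges by a couple of frames, so the reported span sits slightly later than the raw scores. A model that only emits one result per second cannot give boundaries finer than that second.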


2 participants