How do I adjust the volume of synthesized audio? #971

Open
AnonymousmousCoder opened this issue Apr 15, 2024 · 8 comments

Comments

@AnonymousmousCoder

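Some models produce very quiet audio at inference time, possibly because the training audio itself was quiet. How can I change the volume of audio_fragment directly, without writing a temporary file?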

TTS.py

for i, batch in enumerate(audio):
    for j, audio_fragment in enumerate(batch):
        # Simple guard against clipping when converting to 16-bit
        max_audio = torch.abs(audio_fragment).max()
        if max_audio > 1:
            audio_fragment /= max_audio
        audio_fragment: torch.Tensor = torch.cat([audio_fragment, zero_wav], dim=0)
        audio[i][j] = audio_fragment.cpu().numpy()
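
One way to change the volume in memory is to scale the samples right after they become a NumPy array. The sketch below is not part of TTS.py. It assumes the fragment holds floating-point samples nominally in [-1, 1], and the helper names `apply_gain_db` and `normalize_peak` are hypothetical.

```python
import numpy as np


def apply_gain_db(samples: np.ndarray, gain_db: float) -> np.ndarray:
    """Scale samples by gain_db decibels, then rescale if the peak exceeds 1.0."""
    scaled = samples.astype(np.float32) * (10.0 ** (gain_db / 20.0))
    peak = float(np.abs(scaled).max()) if scaled.size else 0.0
    if peak > 1.0:
        # Same guard as the original snippet: avoid clipping in 16-bit output.
        scaled = scaled / peak
    return scaled


def normalize_peak(samples: np.ndarray, target_peak: float = 0.9) -> np.ndarray:
    """Rescale samples so the largest absolute value equals target_peak."""
    samples = samples.astype(np.float32)
    peak = float(np.abs(samples).max()) if samples.size else 0.0
    if peak == 0.0:
        return samples  # silence: nothing to scale
    return samples * (target_peak / peak)


quiet = np.array([0.0, 0.05, -0.1], dtype=np.float32)  # hypothetical quiet fragment
louder = apply_gain_db(quiet, 6.0)   # roughly doubles the amplitude
peaked = normalize_peak(quiet, 0.9)  # peak is now about 0.9
```

Under these assumptions, either helper could be called on the result of `audio_fragment.cpu().numpy()` before it is stored back into `audio[i][j]`. Peak normalization makes every fragment as loud as its highest sample allows, while a fixed gain keeps the relative loudness between fragments.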

@XXXXRT666
Contributor

I suggest applying loudness matching to the audio before training, so that the inferred audio comes out at a normal loudness.

@XXXXRT666
Contributor

If you want, you can also add a loudness-matching step after inference, using librosa.
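
The comment does not say which librosa routine is meant, so the sketch below swaps in a plain NumPy RMS match as a stand-in for loudness matching. It is only an approximation of perceived loudness. The function names and the -20 dBFS target are assumptions chosen for illustration.

```python
import numpy as np


def rms_dbfs(samples: np.ndarray) -> float:
    """Return the RMS level in dB relative to full scale (1.0)."""
    rms = float(np.sqrt(np.mean(samples.astype(np.float64) ** 2)))
    return float("-inf") if rms == 0.0 else 20.0 * np.log10(rms)


def match_rms(samples: np.ndarray, target_dbfs: float = -20.0,
              peak_limit: float = 0.99) -> np.ndarray:
    """Scale samples so their RMS level reaches target_dbfs without clipping."""
    current = rms_dbfs(samples)
    if not np.isfinite(current):
        return samples.astype(np.float32)  # silence stays silence
    gain = 10.0 ** ((target_dbfs - current) / 20.0)
    out = samples.astype(np.float64) * gain
    peak = float(np.abs(out).max())
    if peak > peak_limit:
        # Back off so the loudest sample stays within the peak limit.
        out *= peak_limit / peak
    return out.astype(np.float32)


# Hypothetical quiet input: one second of a 220 Hz tone at amplitude 0.01.
t = np.arange(16000) / 16000.0
quiet_tone = (0.01 * np.sin(2 * np.pi * 220.0 * t)).astype(np.float32)
matched = match_rms(quiet_tone, target_dbfs=-20.0)
```

The same function could be run over the training clips beforehand or over each inferred fragment afterwards. Matching RMS rather than peak keeps clips with occasional loud transients from ending up quieter than the rest.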

@AnonymousmousCoder
Author

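> I suggest applying loudness matching to the audio before training, so that the inferred audio comes out at a normal loudness.

Does the training pipeline have this feature?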

@XXXXRT666
Contributor

No, but there is one in a PR.

@ZhangJianBeiJing

mark

@Wei-JL

Wei-JL commented May 14, 2024

> PR

Hello, could you share the exact link? I couldn't find it by searching the PRs. Or is there another approach?

@XXXXRT666
Contributor

> PR
>
> Hello, could you share the exact link? I couldn't find it by searching the PRs. Or is there another approach?

#937

@panjie-payne

Just write one yourself with ffmpeg; the volume filter is all you need.
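
The ffmpeg suggestion can also be used without a temporary file by piping raw samples through stdin and stdout. The sketch below assumes `ffmpeg` is on the PATH and that the audio is mono 32-bit float PCM. The 32000 Hz sample rate and the helper names are hypothetical, not taken from the project.

```python
import subprocess


def build_volume_cmd(gain, sample_rate: int, channels: int = 1) -> list:
    """Build an ffmpeg command that applies the volume filter to raw f32le PCM.

    gain may be a linear factor (1.5) or a string such as "6dB".
    """
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-f", "f32le", "-ar", str(sample_rate), "-ac", str(channels),
        "-i", "pipe:0",                  # read raw samples from stdin
        "-filter:a", f"volume={gain}",   # ffmpeg's volume audio filter
        "-f", "f32le", "pipe:1",         # write raw samples to stdout
    ]


def apply_volume(pcm_bytes: bytes, gain, sample_rate: int) -> bytes:
    """Run ffmpeg on in-memory PCM bytes and return the processed bytes."""
    cmd = build_volume_cmd(gain, sample_rate)
    result = subprocess.run(cmd, input=pcm_bytes, stdout=subprocess.PIPE, check=True)
    return result.stdout


# Assumed sample rate; use whatever rate the synthesized audio actually has.
cmd = build_volume_cmd(1.5, 32000)
```

With a NumPy fragment, `audio.astype("float32").tobytes()` would give the input bytes and `numpy.frombuffer(out, dtype="float32")` would recover the samples. Spawning a process per fragment is slower than scaling the array directly, so this mainly makes sense when ffmpeg is already part of the pipeline.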
