Question about noise scheduling process. #8

Open
LEECHOONGHO opened this issue Jun 24, 2022 · 1 comment

Comments

@LEECHOONGHO

LEECHOONGHO commented Jun 24, 2022

Hello, I'm trying to implement the noise scheduling process, referring to BDDM's implementation in BDDM/sampler.py.

I have some questions about the noise scheduling process for FastDiff-TTS.

  1. In the FastDiff paper, alphaN and betaN are set as hyperparameters, e.g. $\hat{\alpha}_t = 0.54$, $\hat{\beta}_t = 0.70$. Can I use these hyperparameters for my own FastDiff-TTS module, or for a different number of reverse steps (e.g. 6, 8, 10, ...)? How are they calculated?

  2. In BDDM, searching for alphaN and betaN requires a greedy search with search_bin=9, plus a further search with step=10 that adds noise to the parameters, e.g. _alpha_param = alpha_param * (0.95 + np.random.rand() * 0.1)
    Does FastDiff require a similar process?

  3. In BDDM, STOI and PESQ are estimated on the generated audio to find the best noise schedule. How should we select the best parameters based on the two indicators, STOI and PESQ?

  4. Are STOI and PESQ also needed for the parameter search in FastDiff?

  5. In BDDM, num_reverse_steps = math.floor(T / tau). But in FastDiff, T=1000, tau=200, and num_reverse_steps=4. Do I need to calculate num_reverse_steps as math.floor(T/tau) - 1?
    image
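
For reference, the perturbation quoted in item 2 multiplies a parameter by a factor drawn uniformly from roughly [0.95, 1.05). Below is a minimal sketch of that single step only, not of BDDM's search loop. The helper name `perturb` and the `rng` argument are hypothetical, the standard-library `random.Random.random()` stands in for `np.random.rand()`, and 0.54 is used purely as an illustrative starting value.

```python
import random

def perturb(param, rng):
    """Scale param by a uniform factor in about [0.95, 1.05), as in the quoted line."""
    # rng.random() is uniform on [0, 1), so 0.95 + rng.random() * 0.1 stays near [0.95, 1.05).
    return param * (0.95 + rng.random() * 0.1)

rng = random.Random(0)   # seeded only so the sketch is repeatable
alpha_param = 0.54       # illustrative value (assumption), borrowed from item 1
# Ten perturbed candidates, mirroring the "step=10" mentioned in item 2 (assumed meaning).
candidates = [perturb(alpha_param, rng) for _ in range(10)]
```

Each candidate stays within about 5% of the starting value, so this step explores a small neighbourhood around the current parameter rather than the whole range.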

Thank you.

@Rongjiehuang
Owner

Hi,

  1. It's OK to use a different number of reverse steps; just set the maximum number of sampling steps in scheduling ("N") in BDDM.
  2. The noise predictor of FastDiff shares a similar mechanism with BDDM's, so the calculation of STOI and PESQ is required.
  3. Thanks, this $\tau$ is a typo, and the algorithm remains math.floor(T/tau). You could try it yourself: the higher $\tau$ is, the shorter the predicted inference schedule tends to be.
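
As a quick check of the arithmetic, here is a small sketch that only evaluates math.floor(T / tau) for a few values of tau. The helper `num_reverse_steps` is a hypothetical name for illustration and is not taken from the BDDM or FastDiff code.

```python
import math

def num_reverse_steps(T, tau):
    """Number of reverse steps computed as math.floor(T / tau)."""
    return math.floor(T / tau)

T = 1000
# Larger tau gives fewer reverse steps for the same T.
for tau in (100, 200, 250, 500):
    print(tau, num_reverse_steps(T, tau))
```

With T=1000, tau=200 gives 5 steps and tau=250 gives 4, so no extra "- 1" is needed to reach 4 steps once $\tau$ is corrected.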
