NoiseSchedules cosine seems wrong and lead to division by 0 #489
Comments
Hi there,

In the cosine schedule, the alphas and betas are calculated with clipping, so as far as I can see beta_prod_t is not 0 when t=0:

x = torch.linspace(0, num_train_timesteps, num_train_timesteps + 1)
alphas_cumprod = torch.cos(((x / num_train_timesteps) + s) / (1 + s) * torch.pi * 0.5) ** 2
alphas_cumprod /= alphas_cumprod[0].item()
alphas = torch.clip(alphas_cumprod[1:] / alphas_cumprod[:-1], 0.0001, 0.9999)
betas = 1.0 - alphas
return betas, alphas, alphas_cumprod[:-1]

However, there are documented problems with the cosine scheduler; see the discussion here. @sRassman reports better results when using leading timesteps here. Could you try that and see if it fixes it for you? |
Hi, thanks for your quick response.

Indeed, the alphas are clipped in the code snippet you linked, but those are not the values used in the scheduler. At the beginning of the step function:

alpha_prod_t = self.alphas_cumprod[timestep]
alpha_prod_t_prev = self.alphas_cumprod[timestep - 1] if timestep > 0 else self.one
beta_prod_t = 1 - alpha_prod_t
beta_prod_t_prev = 1 - alpha_prod_t_prev

The issue comes from alphas_cumprod, which is computed without clipping:

alphas_cumprod = torch.cos(((x / num_train_timesteps) + s) / (1 + s) * torch.pi * 0.5) ** 2
alphas_cumprod /= alphas_cumprod[0].item()

So I will try what @sRassman proposed and give you feedback later 👍 |
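The zero can be checked numerically. The sketch below re-implements the quoted cosine computation in plain Python, using `math` instead of `torch` so that it runs without dependencies. The function name `cosine_alphas_cumprod` and the default `s=0.008` are assumptions made for illustration, not the library's API.

```python
import math


def cosine_alphas_cumprod(num_train_timesteps, s=0.008):
    """Unclipped cumulative alphas, mirroring the snippet quoted above.

    Hypothetical helper: the name and the default value of `s` are assumptions.
    """
    raw = [
        math.cos(((t / num_train_timesteps) + s) / (1 + s) * math.pi * 0.5) ** 2
        for t in range(num_train_timesteps + 1)
    ]
    # Dividing by the first entry forces the first value to be exactly 1.0.
    normalised = [value / raw[0] for value in raw]
    # The quoted code returns alphas_cumprod[:-1], so drop the last entry.
    return normalised[:-1]


alphas_cumprod = cosine_alphas_cumprod(1000)
beta_prod_t = 1 - alphas_cumprod[0]  # value used by the scheduler at timestep 0
print(alphas_cumprod[0], beta_prod_t)  # prints "1.0 0.0"
```

Any later division by `beta_prod_t` at timestep 0 is therefore a division by zero.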
ah yes nice spot - it seems like we should be making sure alpha cumprod is calculated from the clipped alphas before we return it from the cosine scheduler |
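A minimal sketch of that suggestion, written in plain Python rather than `torch` so that it is self-contained. The function name `cosine_schedule_clipped` and the default `s=0.008` are assumptions, and this is not necessarily the fix that was eventually merged. It also shifts the indexing by one step relative to the original `alphas_cumprod[:-1]` return value, since the first entry becomes the first clipped alpha instead of 1.

```python
import math


def cosine_schedule_clipped(num_train_timesteps, s=0.008):
    """Cosine schedule whose cumulative product is built from the clipped alphas.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    raw = [
        math.cos(((t / num_train_timesteps) + s) / (1 + s) * math.pi * 0.5) ** 2
        for t in range(num_train_timesteps + 1)
    ]
    # Per-step alphas, clipped to [0.0001, 0.9999] as in the original snippet.
    # The normalisation by raw[0] cancels in the ratio, so it is omitted here.
    alphas = [
        min(max(raw[t + 1] / raw[t], 0.0001), 0.9999)
        for t in range(num_train_timesteps)
    ]
    betas = [1.0 - alpha for alpha in alphas]

    # Cumulative product of the *clipped* alphas, so no entry can equal 1.
    alphas_cumprod = []
    running = 1.0
    for alpha in alphas:
        running *= alpha
        alphas_cumprod.append(running)
    return betas, alphas, alphas_cumprod


betas, alphas, alphas_cumprod = cosine_schedule_clipped(1000)
beta_prod_t = 1 - alphas_cumprod[0]  # strictly positive at timestep 0
print(alphas_cumprod[0], beta_prod_t > 0)
```

With this variant the first cumulative alpha is bounded above by the clipping limit of 0.9999, so `beta_prod_t` at timestep 0 is at least about 1e-4 and the division is well defined.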
Hi, I came across the same issue of receiving NaNs with the cosine schedule due to division by zero. Thanks |
Dear Oded, Note that the MONAI Generative Models repository will soon be archived because the code has been integrated into MONAI core (https://github.com/Project-MONAI). Could you check if using the latest version of the schedulers from MONAI core leads to the same error? If so, we will look at it immediately. Thank you very much! Virginia |
Hi Virginia
If I understand correctly the code here:
https://github.com/Project-MONAI/GenerativeModels/blob/main/generative/networks/schedulers/scheduler.py
has been replaced with:
https://github.com/Project-MONAI/MONAI/blob/dev/monai/networks/schedulers/scheduler.py
The cosine function is exactly the same so I don't expect any difference.
I rewrote the code so that it works, and I urge you to fix this issue. A fix
would be of great value.
Thanks
Oded
On Mon, Sep 23, 2024 at 11:05 AM Virginia Fernandez wrote:
Dear Oded,
Note that the MONAI Generative Models repository will be soon archived
because the code has been integrated in MONAI core (
https://github.com/Project-MONAI). Could you check if using the latest
version of the schedulers from MONAI core leads to the same error?
If so, we will look at it immediately.
Otherwise, please use that alternative repository.
Thank you very much!
Virginia
|
Dear Oded, Thanks. We will look into it. Virginia |
I wanted to use a DDPMScheduler with cosine scheduling and obtained images filled with NaN when sampling.

I quickly inspected the code and found that this is caused by a division by 0 in the step function of the DDPMScheduler class: beta_prod_t is equal to 0 at step 0 when using the cosine scheduler, because it is derived from alphas_cumprod, which in this case is normalized by its first element. Thus, alphas_cumprod[0] = 1 and beta_prod_t = 1 - 1 = 0.

I saw no issue reporting this, so maybe I am using it wrong. 🤷♂️

I tried using DDPMScheduler(num_train_timesteps=1000, schedule="cosine") in the 2d_ddpm_compare_schedulers.ipynb notebook and got NaN-filled images as a result.
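For reference, the division can be seen in the standard DDPM posterior mean (Ho et al., 2020). This assumes the step function follows that formulation; it is a sketch of the reasoning, not a transcription of the library code. Here $\bar\alpha_t$ corresponds to alpha_prod_t and $1 - \bar\alpha_t$ to beta_prod_t:

$$
\tilde\mu_t(x_t, x_0) = \frac{\sqrt{\bar\alpha_{t-1}}\,\beta_t}{1 - \bar\alpha_t}\,x_0 + \frac{\sqrt{\alpha_t}\,\bigl(1 - \bar\alpha_{t-1}\bigr)}{1 - \bar\alpha_t}\,x_t
$$

If $\bar\alpha_t = 1$ at the first timestep, both denominators vanish. The first coefficient becomes $\beta_t / 0$, and the second becomes $0 / 0$ when $\bar\alpha_{t-1}$ is also taken as 1, which yields inf and NaN values in floating point.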