Hi, I was wondering why you decided to use batch norm in the critic of WGAN-GP. The paper on improved training of WGANs (the one where the gradient penalty is proposed) advises against it.
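For context, the paper's reasoning is that the penalty is applied to the gradient of the critic's output with respect to each input individually, so a layer like batch norm that makes each output depend on the whole batch changes what is being penalized. A minimal PyTorch sketch of the standard gradient penalty, assuming a generic `critic` module and 4D image batches (names here are illustrative, not this repo's code):

```python
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    # Random interpolation between real and fake samples (one coefficient per sample)
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).detach().requires_grad_(True)
    scores = critic(interp)
    # Gradient of the critic's output w.r.t. each interpolated input
    grads = torch.autograd.grad(
        outputs=scores, inputs=interp,
        grad_outputs=torch.ones_like(scores),
        create_graph=True, retain_graph=True,
    )[0]
    grads = grads.view(grads.size(0), -1)
    # Penalize each sample's gradient norm for deviating from 1
    return lambda_gp * ((grads.norm(2, dim=1) - 1) ** 2).mean()
```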
Thanks in advance!
Hi,
Although we are aware that using BN in the discriminator of WGAN-GP is not recommended,
we found that batch norm stabilized training in our case.
We also experimented with InstanceNorm, LayerNorm, and no normalization, but could not find a better model.
We have since moved on to another project, so we still don't know exactly why this works.
Further experiments and analysis would be needed to figure it out.
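For anyone who wants to rerun that comparison, here is a minimal PyTorch sketch of a critic block with a swappable normalization layer; `critic_block` and its `norm` argument are hypothetical and not this repo's actual API:

```python
import torch.nn as nn

def critic_block(in_ch, out_ch, norm="batch"):
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)]
    if norm == "batch":
        layers.append(nn.BatchNorm2d(out_ch))    # choice discussed in this issue
    elif norm == "instance":
        layers.append(nn.InstanceNorm2d(out_ch))
    elif norm == "layer":
        layers.append(nn.GroupNorm(1, out_ch))   # LayerNorm over channel + spatial dims
    # norm=None: no normalization at all
    layers.append(nn.LeakyReLU(0.2, inplace=True))
    return nn.Sequential(*layers)
```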
Thanks!