Large Scale GAN Training for High Fidelity Natural Image Synthesis
Task: Conditional GANs
Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal. To this end, we train Generative Adversarial Networks at the largest scale yet attempted, and study the instabilities specific to such scale. We find that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the Generator's input. Our modifications lead to models which set the new state of the art in class-conditional image synthesis. When trained on ImageNet at 128x128 resolution, our models (BigGANs) achieve an Inception Score (IS) of 166.5 and Frechet Inception Distance (FID) of 7.4, improving over the previous best IS of 52.52 and FID of 18.6.
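As the abstract notes, the truncation trick reduces the variance of the Generator's input: latent noise is drawn from a truncated normal instead of the full prior, concentrating samples near the mode. Below is a minimal sketch of one common recipe (the function name `truncated_z` is ours, not from the paper):

```python
import numpy as np
from scipy.stats import truncnorm

def truncated_z(batch_size, dim_z, truncation=0.4, seed=None):
    """Sample latents from N(0, 1) truncated to [-2, 2], then rescale.

    Smaller `truncation` shrinks the input variance, trading sample
    variety for fidelity; values near 1.0 approach the untruncated prior.
    """
    values = truncnorm.rvs(-2.0, 2.0, size=(batch_size, dim_z), random_state=seed)
    return (truncation * values).astype(np.float32)

z = truncated_z(batch_size=4, dim_z=120)  # BigGAN at 128x128 uses a 120-dim latent
```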
BigGAN/BigGAN-Deep is a conditional generative model that can synthesize both high-resolution and high-quality images by scaling up the batch size and the number of model parameters.
We have finished training BigGAN on CIFAR-10 (32x32) and are aligning the training performance on ImageNet1k (128x128). Some sampled results are shown below for your reference.
Evaluation of our trained BigGAN.
| Model | Dataset | FID (Iter) | IS (Iter) | Download |
|---|---|---|---|---|
| BigGAN 32x32 | CIFAR-10 | 9.78 (390000) | 8.70 (390000) | model \| log |
| BigGAN 128x128 Best FID | ImageNet1k | 8.69 (1232000) | 101.15 (1232000) | model \| log |
| BigGAN 128x128 Best IS | ImageNet1k | 13.51 (1328000) | 129.07 (1328000) | model \| log |
Note that the BigGAN 128x128 model was trained with V100 GPUs and CUDA 10.1, and we can hardly reproduce its results with A100 GPUs and CUDA 11.3. If you have any idea about this reproducibility issue, please feel free to contact us.
Since we haven't finished training our models, we provide several converted pre-trained weights that have already been evaluated. These weights are converted from BigGAN-PyTorch and pytorch-pretrained-BigGAN.
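For reference, the pytorch-pretrained-BigGAN package mentioned above can also sample from the BigGAN-Deep weights directly in Python. A minimal sketch (the class name and truncation value are illustrative):

```python
# pip install pytorch-pretrained-biggan
import torch
from pytorch_pretrained_biggan import (BigGAN, convert_to_images,
                                       one_hot_from_names,
                                       truncated_noise_sample)

# 'biggan-deep-128' corresponds to the 128x128 converted weights.
model = BigGAN.from_pretrained('biggan-deep-128')
truncation = 0.4  # illustrative value

noise = torch.from_numpy(truncated_noise_sample(truncation=truncation, batch_size=4))
class_vec = torch.from_numpy(one_hot_from_names(['golden retriever'] * 4, batch_size=4))

with torch.no_grad():
    out = model(noise, class_vec, truncation)  # (4, 3, 128, 128), values in [-1, 1]

convert_to_images(out)[0].save('sample.png')
```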
Evaluation results and download links are provided below.
| Model | Dataset | FID | IS | Download | Original Download link |
|---|---|---|---|---|---|
| BigGAN 128x128 | ImageNet1k | 10.1414 | 96.728 | model | link |
| BigGAN-Deep 128x128 | ImageNet1k | 5.9471 | 107.161 | model | link |
| BigGAN-Deep 256x256 | ImageNet1k | 11.3151 | 135.107 | model | link |
| BigGAN-Deep 512x512 | ImageNet1k | 16.8728 | 124.368 | model | link |
Sampling with the truncation trick can be performed with the command below.
```shell
python demo/conditional_demo.py CONFIG_PATH CKPT_PATH --sample-cfg truncation=0.4  # set truncation value as you want
```
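Lower truncation values trade sample variety for fidelity, as described in the abstract above; a value near 1.0 roughly corresponds to sampling from the untruncated prior.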
For converted weights, we provide model configs under `configs/_base_/models`, listed as follows:
```shell
# biggan_cvt-BigGAN-PyTorch-rgb_imagenet1k-128x128.py
# biggan-deep_cvt-hugging-face-rgb_imagenet1k-128x128.py
# biggan-deep_cvt-hugging-face_rgb_imagenet1k-256x256.py
# biggan-deep_cvt-hugging-face_rgb_imagenet1k-512x512.py
```
To perform image interpolation on BigGAN (or other conditional models), run

```shell
python apps/conditional_interpolate.py CONFIG_PATH CKPT_PATH --samples-path SAMPLES_PATH
```
To perform image interpolation on BigGAN with fixed noise, run

```shell
python apps/conditional_interpolate.py CONFIG_PATH CKPT_PATH --samples-path SAMPLES_PATH --fix-z
```

To perform image interpolation with a fixed label, run

```shell
python apps/conditional_interpolate.py CONFIG_PATH CKPT_PATH --samples-path SAMPLES_PATH --fix-y
```
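Under the hood, these interpolations walk between two samples: `--fix-z` holds the noise fixed while the label varies, and `--fix-y` holds the label fixed while the noise varies. A minimal, generator-agnostic sketch of the fixed-label case (the names `generator` and `interpolate_z` are ours, not this repo's API):

```python
import torch

@torch.no_grad()
def interpolate_z(generator, z_a, z_b, y, steps=8):
    """Linear interpolation between two latent codes under a fixed label.

    `generator` is any conditional G callable as generator(z, y);
    `z_a`/`z_b` are (1, dim_z) latents, `y` a (1,) class-index tensor.
    """
    frames = []
    for t in torch.linspace(0.0, 1.0, steps):
        z_t = torch.lerp(z_a, z_b, t)  # (1 - t) * z_a + t * z_b
        frames.append(generator(z_t, y))
    return torch.cat(frames, dim=0)  # (steps, C, H, W)
```

Spherical interpolation (slerp) is sometimes preferred for Gaussian latents, since a linear blend shrinks the vector norm at the midpoints.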
```bibtex
@inproceedings{brock2018large,
  title={Large Scale {GAN} Training for High Fidelity Natural Image Synthesis},
  author={Andrew Brock and Jeff Donahue and Karen Simonyan},
  booktitle={International Conference on Learning Representations},
  year={2019},
  url={https://openreview.net/forum?id=B1xsqj09Fm},
}
```