Sijie Zhao · Yong Zhang* · Xiaodong Cun · Shaoshu Yang · Muyao Niu
Xiaoyu Li · Wenbo Hu · Ying Shan
*Corresponding Author
TL;DR: A video VAE for latent generative video models that is compatible with pretrained image and video models such as SD 2.1 and SVD.
- 2024-06-07 🤗 We updated the text-to-image inference code for SD 2.1 + CV-VAE.
- 2024-06-03 We have released the inference code and model weights of CV-VAE.
- 2024-05-30 We have updated the arXiv preprint.
- Python >= 3.8 (Anaconda recommended)
- PyTorch >= 1.13.0
- NVIDIA GPU + CUDA
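The requirements above can be checked before running anything. The following is a minimal sketch, not part of the CV-VAE repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the check only covers the three items listed (Python version, PyTorch version, CUDA availability).

```python
import re
import sys


def parse_version(text):
    """Return the leading numeric part of a version string as a tuple of ints.

    Local build suffixes such as "+cu118" are ignored (hypothetical helper).
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    # A missing patch component is treated as 0, so "1.13" equals "1.13.0".
    return tuple(int(part) for part in match.groups(default="0"))


def meets_minimum(version, minimum):
    """True if `version` is at least `minimum`, compared numerically."""
    return parse_version(version) >= parse_version(minimum)


def check_environment():
    """Report whether each listed requirement appears to be satisfied."""
    report = {"python": sys.version_info[:2] >= (3, 8)}
    try:
        import torch  # imported lazily so the check still runs without PyTorch
    except ImportError:
        report["pytorch"] = False
        report["cuda"] = False
    else:
        report["pytorch"] = meets_minimum(torch.__version__, "1.13.0")
        report["cuda"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```

Versions are compared as integer tuples rather than strings, since a plain string comparison would rank "1.9" above "1.13".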
Download the model weights from Hugging Face, then run video inference:
python3 cvvae_inference_video.py \
--vae_path MODEL_PATH \
--video_path INPUT_VIDEO_PATH \
--save_path VIDEO_SAVE_PATH \
--height HEIGHT \
--width WIDTH
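When processing several videos, it may be convenient to drive the command above from Python. The sketch below only assembles and launches that same command line; it uses no flags beyond the five shown. The function names `build_cvvae_command` and `run_cvvae` are hypothetical, and the script is assumed to be in the current working directory.

```python
import subprocess
import sys


def build_cvvae_command(vae_path, video_path, save_path, height, width):
    """Assemble the argument list for cvvae_inference_video.py.

    Hypothetical helper: it mirrors the command-line flags shown above
    and does not call any CV-VAE Python API directly.
    """
    return [
        sys.executable, "cvvae_inference_video.py",
        "--vae_path", str(vae_path),
        "--video_path", str(video_path),
        "--save_path", str(save_path),
        "--height", str(int(height)),
        "--width", str(int(width)),
    ]


def run_cvvae(vae_path, video_path, save_path, height, width):
    """Run the inference script and raise CalledProcessError if it fails."""
    cmd = build_cvvae_command(vae_path, video_path, save_path, height, width)
    subprocess.run(cmd, check=True)
```

A call such as `run_cvvae("MODEL_PATH", "INPUT_VIDEO_PATH", "VIDEO_SAVE_PATH", HEIGHT, WIDTH)` takes the same placeholders as the shell command. Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.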
@article{zhao2024cvvae,
title={CV-VAE: A Compatible Video VAE for Latent Generative Video Models},
author={Zhao, Sijie and Zhang, Yong and Cun, Xiaodong and Yang, Shaoshu and Niu, Muyao and Li, Xiaoyu and Hu, Wenbo and Shan, Ying},
journal={arXiv preprint arXiv:2405.20279},
year={2024}
}