forked from AlexanderInUM/Awesome-Video-Diffusion
# Citations of Some Works
Wang, Xiang, et al. "VideoComposer: Compositional Video Synthesis with Motion Controllability." arXiv preprint arXiv:2306.02018 (2023).
Yang, Mengjiao, et al. "Probabilistic Adaptation of Text-to-Video Models." arXiv preprint arXiv:2306.01872 (2023).
Xing, Jinbo, et al. "Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance." arXiv preprint arXiv:2306.00943 (2023).
Wang, Fu-Yun, et al. "Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising." arXiv preprint arXiv:2305.18264 (2023).
Chen, Weifeng, et al. "Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models." arXiv preprint arXiv:2305.13840 (2023).
Zhang, Yabo, et al. "ControlVideo: Training-free Controllable Text-to-Video Generation." arXiv preprint arXiv:2305.13077 (2023).
Chen, Zijiao, Jiaxin Qing, and Juan Helen Zhou. "Cinematic Mindscapes: High-quality Video Reconstruction from Brain Activity." arXiv preprint arXiv:2305.11675 (2023).
Tang, Zineng, et al. "Any-to-Any Generation via Composable Diffusion." arXiv preprint arXiv:2305.11846 (2023).
Wang, Wenjing, et al. "VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation." arXiv preprint arXiv:2305.10874 (2023).
Ge, Songwei, et al. "Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models." arXiv preprint arXiv:2305.10474 (2023).
Chen, Tsai-Shien, et al. "Motion-Conditioned Diffusion Model for Controllable Video Synthesis." arXiv preprint arXiv:2304.14404 (2023).
Hu, Yaosi, Zhenzhong Chen, and Chong Luo. "LaMD: Latent Motion Diffusion for Video Generation." arXiv preprint arXiv:2304.11603 (2023).
Blattmann, Andreas, et al. "Align Your Latents: High-Resolution Video Synthesis with Latent Diffusion Models." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
Jiang, Yuming, et al. "Text2Performer: Text-Driven Human Video Generation." arXiv preprint arXiv:2304.08483 (2023).
Liu, Vivian, et al. "Generative Disco: Text-to-Video Generation for Music Visualization." arXiv preprint arXiv:2304.08551 (2023).
An, Jie, et al. "Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation." arXiv preprint arXiv:2304.08477 (2023).
Karras, Johanna, et al. "DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion." arXiv preprint arXiv:2304.06025 (2023).
Khachatryan, Levon, et al. "Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators." arXiv preprint arXiv:2303.13439 (2023).
Ni, Haomiao, et al. "Conditional Image-to-Video Generation with Latent Flow Diffusion Models." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
Luo, Zhengxiong, et al. "Decomposed Diffusion Models for High-Quality Video Generation." arXiv preprint arXiv:2303.08320 (2023).
Yu, Sihyun, et al. "Video Probabilistic Diffusion Models in Projected Latent Space." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
Wang, Xiaodong, et al. "Learning 3D Photography Videos via Self-supervised Diffusion on Single Images." arXiv preprint arXiv:2302.10781 (2023).
Esser, Patrick, et al. "Structure and Content-Guided Video Synthesis with Diffusion Models." arXiv preprint arXiv:2302.03011 (2023).
Yu, Lijun, et al. "MAGVIT: Masked Generative Video Transformer." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
Mei, Kangfu, and Vishal Patel. "VIDM: Video Implicit Diffusion Models." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 37. No. 8. 2023.
He, Yingqing, et al. "Latent Video Diffusion Models for High-Fidelity Video Generation with Arbitrary Lengths." arXiv preprint arXiv:2211.13221 (2022).
Nikankin, Yaniv, Niv Haim, and Michal Irani. "SinFusion: Training Diffusion Models on a Single Image or Video." arXiv preprint arXiv:2211.11743 (2022).
Zhou, Daquan, et al. "MagicVideo: Efficient Video Generation with Latent Diffusion Models." arXiv preprint arXiv:2211.11018 (2022).
Ho, Jonathan, et al. "Imagen Video: High Definition Video Generation with Diffusion Models." arXiv preprint arXiv:2210.02303 (2022).
Singer, Uriel, et al. "Make-A-Video: Text-to-Video Generation without Text-Video Data." arXiv preprint arXiv:2209.14792 (2022).
Höppe, Tobias, et al. "Diffusion Models for Video Prediction and Infilling." arXiv preprint arXiv:2206.07696 (2022).
Harvey, William, et al. "Flexible Diffusion Modeling of Long Videos." Advances in Neural Information Processing Systems 35 (2022): 27953-27965.
Yang, Ruihan, Prakhar Srivastava, and Stephan Mandt. "Diffusion Probabilistic Modeling for Video Generation." arXiv preprint arXiv:2203.09481 (2022).
Yin, Shengming, et al. "NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation." arXiv preprint arXiv:2303.12346 (2023).
---
*Until Chapter: Human or Subject Motion*