Awesome-Autoregressive-Visual-Generation

This is a repo to track the latest autoregressive visual generation papers.

Image Tokenizers

Neural Discrete Representation Learning Paper, NeurIPS 2017
Generating Diverse High-Fidelity Images with VQ-VAE-2 Paper, NeurIPS 2019
Taming Transformers for High-Resolution Image Synthesis Paper, CVPR 2021
Autoregressive Image Generation using Residual Quantization Paper, CVPR 2022
* BEIT V2: Masked Image Modeling with Vector-Quantized Visual Tokenizers (for understanding) Paper, Arxiv 2022
Vector-quantized Image Modeling with Improved VQGAN Paper, ICLR 2022
MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation Paper, NeurIPS 2022
* PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers (for understanding) Paper, AAAI 2023
* All in Tokens: Unifying Output Space of Visual Tasks via Soft Token (for understanding) Paper, CVPR 2023
Regularized Vector Quantization for Tokenized Image Synthesis Paper, CVPR 2023
Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization Paper, CVPR 2023
Not all image regions matter: Masked vector quantization for autoregressive image generation Paper, CVPR 2023
Spae: Semantic pyramid autoencoder for multimodal generation with frozen llms Paper, NeurIPS 2023
HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes Paper, TMLR 2024
Finite Scalar Quantization: VQ-VAE Made Simple Paper, ICLR 2024
Planting a seed of vision in large language model Paper, ICLR 2024
Language model beats diffusion -- tokenizer is key to visual generation Paper, ICLR 2024
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis Paper, CVPR 2024
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction Paper, NeurIPS 2024
An Image is Worth 32 Tokens for Reconstruction and Generation Paper, NeurIPS 2024
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% Paper, Arxiv 2024
Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data Paper, Arxiv 2024
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation Paper, Arxiv 2024
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-Regressive Visual Generation Paper, Arxiv 2024
MaskBit: Embedding-free Image Generation via Bit Tokens Paper, Arxiv 2024
ImageFolder: Autoregressive Image Generation with Folded Tokens Paper, Arxiv 2024

AutoRegressive Image Generation

Conditional image generation with pixelcnn decoders Paper, NeurIPS 2016
DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder Paper
Vector Quantized Diffusion Model for Text-to-Image Synthesis Paper
MaskGIT: Masked Generative Image Transformer Paper
BEIT: BERT Pre-Training of Image Transformers Paper
BEIT V2: Masked Image Modeling with Vector-Quantized Visual Tokenizers Paper
MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis Paper
Sequential modeling enables scalable learning for large vision models Paper, Arxiv 2023
4m: Massively multimodal masked modeling Paper, NeurIPS 2023
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper, Arxiv 2024
ControlVAR: Exploring Controllable Visual Autoregressive Modeling Paper, Arxiv 2024
Autoregressive Image Generation without Vector Quantization Paper, Arxiv 2024
MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis Paper, Arxiv 2024
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation Paper, Arxiv 2024
VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling Paper, Arxiv 2024
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining Paper, Arxiv 2024
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model Paper, Arxiv 2024
Scalable Autoregressive Image Generation with Mamba Paper, Arxiv 2024
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation Paper, Arxiv 2024
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation Paper, Arxiv 2024
