The downsampling ratio of the VAE in MAISI is 4, i.e. a [128, 128, 128] patch results in a [32, 32, 32] latent. However, 32^3 = 32768 tokens is very large for a transformer-based model.
Is there a pre-trained VAE with a larger downsampling ratio, such as 8 ([128, 128, 128] -> [16, 16, 16])?
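For illustration, here is a minimal PyTorch sketch of what an 8x downsampling path would look like (this is not MAISI's actual encoder; the channel widths and activation are arbitrary assumptions): three stride-2 Conv3d stages take a [128, 128, 128] patch to a [16, 16, 16] latent, shrinking the token count from 32768 to 4096.

```python
# Hypothetical sketch of an 8x-downsampling 3D encoder, NOT MAISI's architecture.
# Each stride-2 Conv3d halves the spatial size, so three stages give 2^3 = 8x.
import torch
import torch.nn as nn

class Encoder8x(nn.Module):
    def __init__(self, in_channels: int = 1, latent_channels: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(in_channels, 32, kernel_size=3, stride=2, padding=1),  # 128 -> 64
            nn.SiLU(),
            nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1),           # 64 -> 32
            nn.SiLU(),
            nn.Conv3d(64, latent_channels, kernel_size=3, stride=2, padding=1),  # 32 -> 16
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

x = torch.randn(1, 1, 128, 128, 128)
z = Encoder8x()(x)
print(z.shape)  # torch.Size([1, 4, 16, 16, 16])
print(f"tokens at 4x: {32**3}, tokens at 8x: {16**3}")  # 32768 vs 4096
```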
Thank you!