Your VAE Sucks
theadamcolton.github.ioI wrote a short article about jpg and if we could use concepts from how jpg works to make an image autoencoder that has a left-to-right positional bias and variable compression
Basically, existing VAEs are pretty good at compression, but have bad properties like 2D latent position bias and difficulty training on batches of mixed resolutions
So I try something I call DCT-Autoencoder, which takes ideas from JPG to learn compression of patched DCT features of an image
Check it out!