Generative modelling of compressed image file bits
theadamcolton.github.ioi experiment with using a (mostly) unmodified llama model to generate images, by training on the bits from a lossy compression algorithm. It turns out the key is having a decoder which can give 'hints' as conditioning information for the autoregressive model, about what the decoder is going to do with the next token in the stream
Thanks!