ts_zip: Text Compression Using Large Language Models
(bellard.org)

This is interesting, if impractical.
It occurs to me that using LLMs for compression could, in principle, allow lossy compression of text. If a sequence of tokens happens to be costly to encode (in terms of bits), the compressor could replace it with a cheaper sequence that has a very similar meaning in context. I don't imagine it would be very useful for anything, but it's interesting to think about.
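To make the "costly to encode" idea concrete: under an ideal arithmetic coder driven by a model's predictions, a token's cost is -log2 of the probability the model assigns it. Here's a minimal sketch of that bookkeeping, using a hypothetical toy probability table in place of a real LLM's next-token distribution:

```python
import math

# Hypothetical stand-in for an LLM's next-token distribution:
# probability assigned to each token in some fixed context.
TOY_PROBS = {
    "cheap": 0.5,    # high probability -> 1 bit
    "costly": 0.01,  # low probability  -> ~6.6 bits
}

def bit_cost(tokens, probs):
    """Bits an ideal arithmetic coder needs: sum of -log2 p(token)."""
    return sum(-math.log2(probs[t]) for t in tokens)

# A lossy compressor could swap a costly sequence for a cheaper
# near-synonym whenever the bit savings seemed worth it.
print(bit_cost(["costly", "costly"], TOY_PROBS))  # ~13.3 bits
print(bit_cost(["cheap", "cheap"], TOY_PROBS))    # 2.0 bits
```

In a real system the probabilities would come from the LLM conditioned on all preceding tokens, so the same word can be cheap in one context and expensive in another.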