
DRINK ME: (Ab)Using a LLM to compress text

May 04, 2024 - o565.com
The article discusses using large language models (LLMs) for text compression. The author has a model predict the source text one step at a time: when the generated text matches the source, a running count of correctly predicted characters is added to the compressed string; when it does not, the literal character from the source is appended and prediction resumes from there. Tested on the first chapter of "Alice's Adventures in Wonderland", this method compressed the text to about 8% of its original size.
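The compression loop described above can be sketched as follows. This is an illustrative reconstruction, not the author's actual code: the function names are assumptions, and a toy memorizing predictor stands in for the LLM's deterministic (greedy) next-character prediction.

```python
# Minimal sketch of the compression loop, assuming a deterministic
# predict_next(prefix) -> str interface. In the real scheme this would
# query an LLM greedily; here a toy stand-in memorizes one string.

def compress(source: str, predict_next) -> list:
    """Walk the source one character at a time. When the model's
    prediction matches, extend a run counter; when it misses, flush
    the counter and store the literal character from the source."""
    out, run, prefix = [], 0, ""
    for ch in source:
        if predict_next(prefix) == ch:
            run += 1                 # model guessed right: extend the run
        else:
            if run:
                out.append(run)      # flush the run of correct guesses
                run = 0
            out.append(ch)           # keep the literal character
        prefix += ch
    if run:
        out.append(run)
    return out

# Toy stand-in "model": perfectly memorizes one string and predicts the
# character at the current position (deterministic, like greedy decoding).
MEMORIZED = "hello world"
toy_predict = lambda p: MEMORIZED[len(p)] if len(p) < len(MEMORIZED) else ""
```

On text the toy has fully memorized, the entire input collapses to a single run count, e.g. `compress("hello world", toy_predict)` yields `[11]`, while a divergent input such as `"hello earth"` keeps only the unpredicted characters: `[6, 'e', 'a', 1, 't', 'h']`. This mirrors why the method works best on text the model already knows well.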

The author also discusses decompressing the compressed text. The decompression function splits the compressed string into sections, either regenerating the elided parts with the model or appending literal text directly. The author acknowledges that this method works best on data the model has been trained on, and raises questions about the practicality of training a model specifically for compression, whether the method could identify training data, and whether it could be extended to other data types such as images.
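The decompression side can be sketched the same way. Again the names and the toy memorizing predictor are assumptions standing in for the real model; round-tripping only works if decompression queries the same model with deterministic (greedy) prediction, so it regenerates exactly the characters the compressor counted.

```python
# Minimal sketch of decompression, assuming the same deterministic
# predict_next(prefix) -> str interface used during compression.

def decompress(compressed: list, predict_next) -> str:
    """Rebuild the text: an integer N means 'let the model generate the
    next N characters'; anything else is a literal appended directly."""
    text = ""
    for token in compressed:
        if isinstance(token, int):
            for _ in range(token):
                text += predict_next(text)   # regenerate the predicted run
        else:
            text += token                    # restore the literal character
    return text

# Same toy stand-in "model" as in the compression sketch.
MEMORIZED = "hello world"
toy_predict = lambda p: MEMORIZED[len(p)] if len(p) < len(MEMORIZED) else ""
```

For example, `decompress([11], toy_predict)` regenerates `"hello world"` entirely from the model, and a mixed sequence like `[6, 'e', 'a', 1, 't', 'h']` yields `"hello earth"`. Any nondeterminism in the model (sampling, version drift) would silently corrupt the output, which is one practical caveat of the scheme.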

Key takeaways:

  • The author explores the possibility of extracting training text from large language models (LLMs) and using these models to reproduce text they have not been directly trained on.
  • The author developed a solution that includes functions to load documents, generate text, compress text, and decompress text.
  • The method was tested on the first chapter of 'Alice's Adventures in Wonderland' and achieved significant compression, reducing the number of characters from 11,994 to 986.
  • The author raises questions about the practicality of training a model for the purpose of compression, the possibility of identifying training data through this method, the potential performance of different models, and the extension of this method to other data types like images.
