ChatGPT Is a Blurry JPEG of the Web
Abstract
The article draws an analogy between Xerox photocopiers and large language models like ChatGPT: both rely on lossy compression, which can produce output that looks plausible but is inaccurate, known in the models' case as "hallucinations."
Q&A
[01] Xerox Photocopier Incident
1. What was the issue discovered with the Xerox photocopier?
- Workers at a German construction company noticed that when they photocopied a floor plan, every room in the copy was labeled with the same area, 14.13 square meters, even though the original listed a different area for each room.
- The cause was the Xerox photocopier's use of JBIG2, a lossy compression format that judged the area labels to be near-identical, stored only one of them, and reused it for all three rooms (see the sketch after this list).
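To make the failure mode concrete, here is a minimal sketch of pattern-matching symbol compression in the spirit of JBIG2's symbol-coding mode. It is not the actual codec; the function names, the mean-pixel-difference test, and the threshold are illustrative assumptions.

```python
import numpy as np

def compress_symbols(tiles, threshold=0.05):
    """Toy pattern-matching compressor in the spirit of JBIG2 symbol coding.

    tiles: equal-sized binary (0/1) arrays, one per glyph on the page.
    Glyphs whose mean pixel difference falls below `threshold` are treated
    as the same symbol, so only one copy is kept -- that is the lossy step.
    """
    dictionary, placements = [], []
    for tile in tiles:
        for idx, symbol in enumerate(dictionary):
            if np.mean(tile != symbol) < threshold:  # "close enough" match
                placements.append(idx)               # reuse the stored glyph
                break
        else:
            dictionary.append(tile)                  # genuinely new symbol
            placements.append(len(dictionary) - 1)
    return dictionary, placements

def decompress_symbols(dictionary, placements):
    """Rebuild the page by stamping the stored glyphs back into place."""
    return [dictionary[idx] for idx in placements]
```

With a threshold that is too permissive, the tiles for a "6" and an "8" can collapse into one dictionary entry, so the decompressed page prints the same digit in both places, crisp and readable but wrong, which is why the flawed copies were so hard to spot.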
2. How does the Xerox photocopier issue relate to large language models like ChatGPT?
- Large language models can be thought of as "blurry JPEGs" of all the text on the web, where they retain much of the information but cannot provide exact quotes or sequences of text.
- Just as the Xerox photocopier produced plausible but incorrect labels, large language models can produce "hallucinations" or nonsensical answers that sound plausible but are not factually accurate.
- Because the models store a lossy representation of their training text, a significant portion of the original information is discarded, and their output is an approximate reconstruction rather than a retrieval (a toy illustration follows this list).
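As a toy stand-in for this idea (not how ChatGPT actually works), the sketch below keeps only word-pair statistics from a corpus and regenerates text by sampling likely continuations; it preserves the general texture of the source while being unable to replay long passages verbatim.

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Store only word-pair statistics -- a lossy summary of the corpus."""
    words = text.split()
    successors = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        successors[prev].append(nxt)
    return successors

def generate(successors, start, length=20, seed=0):
    """'Reconstruct' text by sampling likely continuations, not by recall."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        choices = successors.get(out[-1])
        if not choices:
            break
        out.append(rng.choice(choices))
    return " ".join(out)

corpus = "the web is full of text and the web is full of pictures of text"
model = train_bigrams(corpus)
print(generate(model, "the"))   # plausible continuation, never an exact quote
```

The bigram table is far smaller than the corpus it was built from, and the text it emits is an interpolation between fragments it has seen rather than a quotation, which is the sense in which the article calls the model a blurry JPEG of the web.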
[02] Compression and Understanding
1. What is the relationship between compression and understanding, as proposed by AI researcher Marcus Hutter?
- Hutter believes that better text compression will be instrumental in creating human-level AI, as the greatest compression can be achieved by truly understanding the text.
- For example, if a compression algorithm understands the principles of arithmetic, it can compress a file of arithmetic examples far more effectively than it could by storing the examples verbatim (illustrated in the sketch after this list).
- Similarly, the more a compression algorithm knows about topics like economics or physics, the more it can discard redundant information when compressing related text.
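A hedged illustration of that point, using an invented file of addition examples: a compressor that "understands" addition can store only the operands plus a rule and recompute the right-hand sides on decompression, passing an exact round-trip check, while a generic compressor such as zlib must also encode the answers.

```python
import zlib

# An invented file of arithmetic examples, one equation per line.
pairs = [(a, b) for a in range(100, 200) for b in range(300, 310)]
original = "\n".join(f"{a} + {b} = {a + b}" for a, b in pairs).encode()

# Generic lossless compression: zlib has no idea the sums are derivable.
generic = zlib.compress(original)

# A compressor that "knows" addition: keep only the operands and a rule.
operands = "\n".join(f"{a},{b}" for a, b in pairs).encode()
with_rule = zlib.compress(operands)

def decompress_with_rule(blob):
    """Recompute each sum on decompression instead of storing it."""
    rows = zlib.decompress(blob).decode().splitlines()
    return "\n".join(f"{a} + {b} = {int(a) + int(b)}"
                     for a, b in (row.split(",") for row in rows)).encode()

assert decompress_with_rule(with_rule) == original    # still perfectly lossless
print(len(original), len(generic), len(with_rule))    # exact sizes will vary
```

The rule-aware version achieves better compression precisely because it has captured the regularity that generated the data, which is the sense in which Hutter ties compression to understanding.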
2. How do large language models like ChatGPT compare to the goal of lossless compression?
- Large language models do not perform lossless compression, as they cannot reconstruct the original text precisely. Their compression is lossy, retaining the "gist" of the information but not the exact details.
- This lossy compression can create the illusion of understanding, as the models are able to rephrase information in a plausible way, similar to how a human student might express ideas in their own words.
- However, the article suggests that this ability to rephrase does not necessarily indicate genuine understanding: the models may never have derived the underlying principles, as their struggles with tasks like multi-digit arithmetic suggest (see the sketch after this list).
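The contrast the article gestures at, between interpolating over memorized examples and applying a derived principle, can be made concrete with a toy comparison; both functions below are hypothetical stand-ins, not a claim about how ChatGPT computes.

```python
def add_by_lookup(a, b, memorized):
    """Mimic interpolation over memorized examples: it answers only for
    operand pairs it has already seen."""
    return memorized.get((a, b))

def add_by_principle(a, b):
    """Apply the carrying algorithm digit by digit, so the answer
    generalizes to operands never seen before."""
    da, db = str(a)[::-1], str(b)[::-1]
    carry, digits = 0, []
    for i in range(max(len(da), len(db))):
        total = carry
        total += int(da[i]) if i < len(da) else 0
        total += int(db[i]) if i < len(db) else 0
        carry, digit = divmod(total, 10)
        digits.append(str(digit))
    if carry:
        digits.append(str(carry))
    return int("".join(reversed(digits)))

memorized = {(12, 34): 46, (7, 8): 15}      # a tiny "training set"
print(add_by_lookup(245, 821, memorized))   # None: nothing to interpolate from
print(add_by_principle(245, 821))           # 1066: the derived rule generalizes
```

A model that had genuinely derived the carrying rule would behave like the second function on arbitrary operands; the article's point is that the observed failures on multi-digit arithmetic look more like the first.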
[03] Potential Uses and Limitations of Large Language Models
1. Can large language models replace traditional search engines?
- There are concerns about the reliability of the information in large language models, as they may have been fed propaganda or conspiracy theories, and their "blurriness" can lead to fabricated or inaccurate responses.
- It's unclear if it's technically possible to retain the acceptable "blurriness" of rephrasing information while eliminating the unacceptable blurriness of outright fabrication.
2. Should large language models be used to generate web content?
- Using large language models to generate web content would essentially be repackaging existing information, which could make it harder for people to find accurate, original information online.
- This type of content generation is similar to the work of "content mills," which the article suggests is not beneficial for people searching for information.
3. Can large language models assist human writers in creating original work?
- The article argues that starting with the output of a large language model, which is essentially a "blurry JPEG" of existing information, is not a good way to create original work.
- The process of writing, including the struggle to express one's thoughts and the awareness of the distance between the first draft and the desired outcome, is an important part of developing writing skills and generating original ideas.
- Relying too heavily on large language model output may deprive writers of this essential creative process.