Welcome To Hell, OpenAI
Abstract
The article discusses the challenges faced by OpenAI and Elon Musk in content moderation and the inherent difficulties in altering large language models (LLMs) like GPT-3 to produce "safer" outputs. It compares OpenAI's approach with Google's more constrained AI Test Kitchen, and argues that the only winning move is not to play the content moderation game.
Q&A
[01] Welcome To Hell, OpenAI
1. What are the key challenges faced by Elon Musk and OpenAI in content moderation?
- Elon Musk, as the new owner of Twitter, must balance the needs of advertisers and the growth of the user base, both of which require minimizing harmful speech, against appeasing his vocal fans, who treat any content moderation as a personal affront.
- OpenAI has faced similar challenges after plugging a version of its flagship LLM, GPT-3, into a public-facing chat interface (ChatGPT). This has led to an endless barrage of complaints about the bot's alleged biases and its systematic favoring or disfavoring of certain groups.
2. How does the author describe the nature of LLM output and the difficulty in altering it?
- The author describes GPT-3 as a "vast funhouse mirror that produces bizarre distorted reflections of the entirety of the billions of words comprised by its training data."
- Altering the model to reduce the likelihood of "horrifying text" without also suppressing the benign syntactic information that such text helps to reinforce is an "impossible job": the work is "delicate" and its effects "unpredictable."
- Each attempt to alter the model "subtly warps its other regions in unanticipated ways" that may go unnoticed until the public starts using it; the toy sketch below illustrates the effect.
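To see why a targeted patch can ripple outward, here is a toy, self-contained illustration. It is not the author's method and is nowhere near the scale of GPT-3; the small MLP, the invented "bad" inputs, and the hyperparameters are all assumptions made purely for the demo. Gradient updates aimed at muting the model's response to a handful of inputs measurably shift its outputs on inputs the patch never touched:

```python
# Toy illustration (not the author's method): fine-tuning a small model to
# suppress one behavior also shifts its outputs on unrelated inputs.
# The model size, data, and hyperparameters are all invented for this demo.
import torch
import torch.nn as nn

torch.manual_seed(0)

# A small MLP standing in for a language model's next-token scorer.
model = nn.Sequential(nn.Linear(16, 64), nn.Tanh(), nn.Linear(64, 16))

# "Unrelated" probe inputs that the patch never trains on.
probe = torch.randn(8, 16)
before = model(probe).detach()

# Pretend these few inputs elicit the "horrifying" output we want to suppress,
# so we nudge the model toward a muted (zero) response on them only.
bad_inputs = torch.randn(4, 16)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(50):
    opt.zero_grad()
    loss = model(bad_inputs).pow(2).mean()  # push "bad" responses toward zero
    loss.backward()
    opt.step()

after = model(probe).detach()
# The patch was aimed only at bad_inputs, yet the probe outputs moved too.
drift = (after - before).abs().mean().item()
print(f"mean drift on untouched probe inputs: {drift:.4f}")  # nonzero drift
```

The same entanglement, at vastly larger scale, is why each fix applied to ChatGPT risks regressions elsewhere that nobody notices until the public finds them.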
3. What is the author's view on OpenAI's decision to embark on the process of altering ChatGPT?
- The author acknowledges that it was a necessary decision: the unabashedly bigoted text the unaltered GPT-3 can produce would not be suitable for a public-facing interactive chatbot.
- However, the author argues that suppressing the problematic associations in the training data without suppressing the useful output is a "Sisyphean task" with no guarantee of success.
[02] Google's Approach
1. How does the author describe Google's approach with the AI Test Kitchen?
- Google's AI Test Kitchen provides several interfaces to LaMDA, its internal conversation-oriented LLM, each of which dramatically shrinks the "mirror" (prompt space) that users can interact with.
- The first two interfaces allow only a very limited initial prompt, while the third lets the user say whatever they want but limits the response space to a tiny point.
- This approach lets Google alter the model far more precisely toward the desired output, making it significantly more boring but also far less prone to unintended behavior; a hypothetical sketch of such a constrained interface follows below.
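To make the "shrunken mirror" idea concrete, here is a minimal hypothetical sketch. It is not Google's actual AI Test Kitchen code: the function name, the topic whitelist, and the `generate` callable are all invented for illustration. The point is only that the user fills a single whitelisted slot in a fixed template, and the reply is truncated, so the space of prompts and responses the model ever exposes is tiny:

```python
# Hypothetical sketch of a constrained LLM interface (invented names throughout):
# the user cannot type free-form prompts, only pick from a small whitelist,
# and the model's reply is clipped to a short, predictable shape.
from typing import Callable

ALLOWED_TOPICS = {"a treehouse", "a garden", "a reading nook"}  # tiny prompt space

def describe_place(topic: str, generate: Callable[[str], str]) -> str:
    """Send only a whitelisted topic to the model, inside a fixed template."""
    if topic not in ALLOWED_TOPICS:
        raise ValueError(f"topic must be one of {sorted(ALLOWED_TOPICS)}")
    prompt = f"In two cheerful sentences, describe what it is like to be in {topic}."
    reply = generate(prompt)
    # Constrain the response space as well: keep at most two sentences.
    sentences = [s.strip() for s in reply.split(".") if s.strip()]
    return ". ".join(sentences[:2]) + "."

# Usage with any text-generation callable (a local model, an API client, etc.):
# print(describe_place("a garden", my_generate_fn))
```

Because the user can only influence one whitelisted slot in a fixed template, the set of prompts the model ever sees is small enough to audit, which is what makes the behavior tractable to tune.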
2. What is the author's assessment of Google's approach compared to OpenAI's?
- The author argues that by shrinking the funhouse mirror to a manageable size, Google can more effectively anticipate and guard against the kind of unintended behavior that plagues OpenAI's ChatGPT.
- The author suggests that this approach is significantly more effective in dealing with the content moderation challenges faced by OpenAI.
[03] The Futility of Content Moderation
1. What is the author's view on the futility of content moderation for large language models like ChatGPT?
- The author argues that no matter how many times OpenAI alters ChatGPT to reduce the likelihood of problematic output, the model will ultimately produce "random bullshit" that someone will inevitably interpret as confirmatory evidence of bias.
- The author suggests that the "only winning move is not to play" the content moderation game, as it is an "impossible job" with no guaranteed progress.
2. How does the author compare OpenAI's and Google's approaches to this challenge?
- The author suggests that Google's approach of dramatically shrinking the prompt and response space deals with the content moderation challenge far more effectively, since it lets Google alter the model with much greater precision toward the desired output.
- In contrast, the author argues that OpenAI's approach of iteratively altering the vast and unpredictable GPT-3 model is a "Sisyphean task" with no way to ensure the model has truly learned to be "safe."