🌈 Abstract

The article discusses a mysterious AI model called "gpt2-chatbot" that was made available on the LMSYS platform. The author believes this model is likely an early version of GPT-4 or a similar high-capability model from OpenAI, based on evidence including API error messages, output quality, and model behavior.

🙋 Q&A

[01] Background and Quick Rundown

1. What are the key points about the gpt2-chatbot model?

  • The model demonstrates capabilities far beyond a typical GPT-2 model, leading the author to believe it is likely an early version of GPT-4 or a similar high-end model from OpenAI.
  • The model uses OpenAI's tokenizer and exhibits OpenAI-specific vulnerabilities, suggesting it is an OpenAI model.
  • The model refers to itself as "ChatGPT" and provides detailed contact information for OpenAI when asked.
  • The model's output quality is on par with or exceeds that of other high-end models like GPT-4 and Claude Opus.

2. What are the possible reasons for OpenAI to make this model available on the platform?

  • To benchmark the latest GPT model without it being obvious that it's GPT-4.5/5, in order to:
    • Get "ordinary benchmark" test responses without elevated expectations
    • Avoid potential negative ratings due to high expectations
    • Decrease the likelihood of being "mass-downvoted" by competing entities
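The concern about coordinated downvoting can be illustrated with a minimal rating sketch. It assumes a plain online Elo update with a K-factor of 32 and a starting rating of 1200; these are illustrative assumptions, not LMSYS' actual rating computation, and the function names are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one vote.

    score_a is 1.0 if A won the vote, 0.0 if A lost, 0.5 for a tie.
    k = 32 is an assumed K-factor chosen for illustration only.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


def rating_after_losses(start: float, opponent: float, n_losses: int) -> float:
    """Rating of a model after n consecutive lost votes against one opponent."""
    model, other = start, opponent
    for _ in range(n_losses):
        model, other = elo_update(model, other, score_a=0.0)
    return model
```

Under these assumptions, a run of lost votes lowers a model's rating regardless of output quality, which is why an anonymous name that attracts less attention could reduce exposure to targeted voting.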

[02] Analysis of Service-specific Error Messages

1. How do the error messages from gpt2-chatbot compare to those of other models?

  • The error messages from gpt2-chatbot are similar to those from confirmed OpenAI models, indicating it is likely an OpenAI model.
  • Other models, such as LLaMA or Yi, provide different error messages that are specific to their own backends.

2. What does the similarity in error messages suggest about the gpt2-chatbot model?

  • The similarity in error messages between gpt2-chatbot and confirmed OpenAI models is a strong indication that gpt2-chatbot is an OpenAI model, as it suggests the model is running on an OpenAI-managed or -connected server.
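The error-message comparison described in this section can be sketched as simple pattern matching. The example below is a rough illustration only: the backend names and regular expressions are hypothetical placeholders, not real error strings from OpenAI, LLaMA, Yi, or any other provider, and the article does not state that the author used code for this step.

```python
import re
from typing import Optional

# Hypothetical fingerprints, one per backend. These patterns are placeholders
# for illustration and do not reproduce any provider's real error format.
FINGERPRINTS = {
    "backend_a": re.compile(r"^\(error_code: \d+\) upstream-a:"),
    "backend_b": re.compile(r"^backend-b failure \[\d+\]"),
}


def classify_error(message: str) -> Optional[str]:
    """Return the name of the first backend whose pattern matches, else None."""
    for backend, pattern in FINGERPRINTS.items():
        if pattern.search(message):
            return backend
    return None


def same_backend(unknown_error: str, reference_error: str) -> bool:
    """True when both errors match the same known fingerprint."""
    unknown = classify_error(unknown_error)
    return unknown is not None and unknown == classify_error(reference_error)
```

A match of this kind would only show that two models are served through the same backend or proxy; it says nothing on its own about which model produced the response.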

[03] LMSYS Model Evaluation Policy

1. What changes were made to LMSYS' model evaluation policy, and why?

  • LMSYS updated their model evaluation policy on 2024-04-29, likely as an ad-hoc measure in response to the gpt2-chatbot hype and the public's concerns about the opaque model information policy.
  • The changes may have been made either to stabilize gpt2-chatbot's rating or to temporarily remove the model from the platform because of unexpectedly high traffic and capacity limits.
