The article also provides evidence supporting this theory, such as the model's use of OpenAI's tokenizer, its detailed responses to requests for emergency and legal contact information, and its vulnerability to OpenAI-specific prompt injections. The author encourages readers to test the model and provide feedback, and notes that the model has a different rate limit than the GPT-4 models, which could imply higher compute costs.
Key takeaways:
- The `gpt2-chatbot` model on LMSYS is suspected to be a more advanced GPT model, possibly GPT-4.5 or GPT-5, due to its superior output quality.
- The model is believed to be a 'stealth drop' by OpenAI, intended to benchmark its latest model without users knowing which model they are rating or deliberately seeking it out.
- The `gpt2-chatbot` model has been verified to use OpenAI's tiktoken tokenizer and exhibits OpenAI-specific prompt injection vulnerabilities.
- The model has a different rate limit from the GPT-4 models, which could imply higher compute costs and an intent to steer users toward Arena (Battle) mode, where benchmark results are generated.