The author further compares the performance of ChatGPT-3.5 and ChatGPT-4, noting that the latter can sustain legal play for longer even without the magic prompt. The author speculates that the prompt improves performance because it supplies the entire game score at each step, which more closely matches the chess game scores seen in ChatGPT's training data. The author concludes by inviting further exploration of hidden capabilities of AI models that can be elicited with non-obvious prompts.
Key takeaways:
- ChatGPT can play chess when correctly prompted, playing at around 1000 Elo and consistently making legal moves until about 20-30 moves into the game.
- The author elicits this behavior with a "magic prompt": before each of ChatGPT's moves, provide the full game score so far and ask for the next move in algebraic notation with no other commentary (see the sketch after this list).
- ChatGPT-3.5 performs significantly better with the magic prompt than without it, suggesting that it exhibits careful knowledge of the game's rules only under this specialized prompt and otherwise relies on opening memory and general patterns.
- ChatGPT-4 can play legally for a long time even without the magic prompt, an improvement over ChatGPT-3.5. The author speculates that this might be because the network stores a representation of the then-current game state at each token of the chess game score.
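
To make the prompting scheme concrete, here is a minimal sketch of such a loop in Python, assuming the `openai` and `python-chess` packages. The prompt wording, model name, and self-play setup are illustrative guesses, not the author's exact "magic prompt" or test harness.

```python
# Minimal sketch of the "full game score before every move" prompting loop.
# Assumptions (not from the original post): openai and python-chess are
# installed, OPENAI_API_KEY is set, and the prompt text below is only a
# guess at the author's "magic prompt".
import chess
from openai import OpenAI

client = OpenAI()

def game_score(moves: list[str]) -> str:
    """Render SAN moves as a numbered score, e.g. '1. e4 e5 2. Nf3 Nc6'."""
    parts = []
    for i, san in enumerate(moves):
        if i % 2 == 0:                      # White's move starts a new pair
            parts.append(f"{i // 2 + 1}.")
        parts.append(san)
    return " ".join(parts)

def next_move(moves: list[str]) -> str:
    """Ask the model for one move, given the entire game score so far."""
    prompt = (
        "Here is the score of a chess game in progress:\n"
        f"{game_score(moves) or '(no moves yet; you are White)'}\n"
        "Reply with only the next move in standard algebraic notation, "
        "with no commentary."
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",              # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()

# Have the model play both sides until it produces an illegal move
# (python-chess raises a ValueError subclass for illegal/unparsable SAN).
board = chess.Board()
moves: list[str] = []
while not board.is_game_over():
    san = next_move(moves)
    try:
        board.push_san(san)
    except ValueError:
        print(f"Illegal or unparsable move at ply {len(moves) + 1}: {san!r}")
        break
    moves.append(san)

print(game_score(moves))
```

Tracking the board with `python-chess` serves two purposes here: it detects the first illegal move, and it regenerates exactly the numbered score format that, per the author's speculation, resembles the chess game scores in the training data.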