Papers with Code - The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"

Sep 23, 2023 - paperswithcode.com
The article discusses a notable failure of generalization in large language models (LLMs), termed the "Reversal Curse": a model trained on a sentence of the form "A is B" fails to generalize to the reverse form "B is A". For instance, a model trained on "Olaf Scholz was the ninth Chancellor of Germany" does not thereby learn to answer "Who was the ninth Chancellor of Germany?". Its likelihood of producing the correct name is no higher than for a random name, indicating a basic failure of logical deduction.
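One way to picture the likelihood comparison described above is to score a reversed prompt against both the correct name and a random-name baseline. The sketch below is not the authors' code; it assumes a Hugging Face causal LM (a small stand-in model) and placeholder example strings.

```python
# Minimal sketch: compare the log-likelihood a causal LM assigns to the correct
# completion versus a random-name baseline for a reverse-direction prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the paper finetunes GPT-3 and Llama-1
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def completion_logprob(prompt: str, completion: str) -> float:
    """Sum of token log-probabilities of `completion` conditioned on `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # log-probabilities at position i predict token i+1
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    # keep only the tokens belonging to the completion
    n_prompt = prompt_ids.shape[1]
    return token_lp[0, n_prompt - 1:].sum().item()

reverse_prompt = "The ninth Chancellor of Germany was"
correct_name = " Olaf Scholz"
random_name = " Maria Keller"  # arbitrary baseline name, invented here

print(completion_logprob(reverse_prompt, correct_name))
print(completion_logprob(reverse_prompt, random_name))
```

If the Reversal Curse holds for a model trained only in the forward direction, the two scores come out comparable rather than favoring the correct name.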

The authors provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements and showing that the models fail to answer the reversed questions. The effect is robust across model sizes and families and is not alleviated by data augmentation. The authors also evaluated ChatGPT on questions about real-world celebrities, finding a large gap between its accuracy on direct questions and on the corresponding reversed ones, and they hypothesize that this asymmetry is another manifestation of the Reversal Curse.
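The finetuning setup can be illustrated with a small data-construction sketch. The names and templates below are invented for illustration (they echo the style of the paper's fictitious facts but are not its dataset): each training example states "name is description", and each reverse test asks for the name given the description.

```python
# Illustrative sketch of the experimental setup: train on "A is B" statements,
# then test with description-first questions whose answer is the name.
import json

fictitious_facts = [
    ("Daphne Barrington", "the director of the film 'A Journey Through Time'"),
    ("Uriah Hawthorne", "the composer of the symphony 'Abyssal Melodies'"),
]

# Forward-direction finetuning data: "A is B"
train_examples = [
    {"prompt": "", "completion": f"{name} is {description}."}
    for name, description in fictitious_facts
]

# Reverse-direction tests: the model saw "A is B" but is asked who B is
reverse_tests = [
    {"question": f"Who is {description}?", "answer": name}
    for name, description in fictitious_facts
]

print(json.dumps(train_examples, indent=2))
print(json.dumps(reverse_tests, indent=2))
```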

Key takeaways:

  • Large language models (LLMs) like GPT-3 and Llama-1 fail to generalize the reversal of a sentence. For example, if trained on "A is B", they do not automatically understand "B is A". This is referred to as the Reversal Curse.
  • Finetuning on fictitious statements confirms the effect: the models do not assign a higher likelihood to the correct answer in reverse-order queries than to a random name.
  • The Reversal Curse is robust across different model sizes and families and is not mitigated by data augmentation.
  • Even advanced models like GPT-4 show this failure, correctly answering questions like "Who is Tom Cruise's mother?" 79% of the time, but only correctly answering the reverse question "Who is Mary Lee Pfeiffer's son?" 33% of the time.
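The celebrity comparison in the last takeaway amounts to asking each question in both directions and tallying accuracy. The sketch below assumes the OpenAI Python client (v1-style API) and a placeholder list of parent-child pairs; it is an illustration of the evaluation idea, not the authors' harness.

```python
# Sketch of the forward/reverse accuracy comparison on parent-child questions,
# assuming the OpenAI Python client; the pair list is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

pairs = [
    # (child, parent) -- the Tom Cruise example is quoted in the summary above
    ("Tom Cruise", "Mary Lee Pfeiffer"),
]

def ask(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

forward_hits = reverse_hits = 0
for child, parent in pairs:
    if parent.lower() in ask(f"Who is {child}'s mother?").lower():
        forward_hits += 1
    if child.lower() in ask(f"Who is {parent}'s son?").lower():
        reverse_hits += 1

print(f"forward accuracy: {forward_hits / len(pairs):.0%}")
print(f"reverse accuracy: {reverse_hits / len(pairs):.0%}")
```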