The author also discusses his attempts to train transformers to predict cellular automata, a task that proved more challenging than expected. Although the models learned some rules, they failed to generalize and struggled with tasks requiring memory and computation. The author concludes that while LLMs can mimic computation and learn implicit associations within data, they struggle with tasks that require iterative reasoning and maintaining a consistent goal. He suggests that this could be partially addressed through methods such as chain-of-thought prompting or using other LLMs to review and correct the output.
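
To make the cellular automaton task concrete, here is a minimal sketch of the kind of target such a model would have to predict. It assumes one-dimensional elementary cellular automata with wrap-around edges; the summary does not say which automata or boundary conditions the author used, and the function names `step` and `run` are illustrative only.

```python
def step(cells, rule):
    """Apply one update of an elementary cellular automaton.

    `cells` is a list of 0/1 values; `rule` is the Wolfram rule number (0-255).
    Edges wrap around (an assumption, not taken from the source).
    """
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # The three-cell neighbourhood, read as a 3-bit number, selects one bit of the rule.
        idx = (left << 2) | (center << 1) | right
        out.append((rule >> idx) & 1)
    return out


def run(cells, rule, steps):
    """Return the list of states from the initial row through `steps` updates."""
    history = [list(cells)]
    for _ in range(steps):
        history.append(step(history[-1], rule))
    return history


if __name__ == "__main__":
    for row in run([0, 0, 0, 1, 0, 0, 0], rule=90, steps=3):
        print("".join("#" if c else "." for c in row))
```

Predicting the row several steps ahead means applying `step` repeatedly, which is the kind of iterative computation the summary says the models struggled with.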
Key takeaways:
- Large Language Models (LLMs) have shown impressive capabilities but still struggle with seemingly simple tasks, which motivates the author's exploration of their failure modes.
- Despite their ability to answer complex questions, LLMs fail at tasks such as creating word grids or playing Sudoku, and they suffer from a "Reversal Curse": a model that has learned "A is B" often fails to infer "B is A".
- The author suggests that LLMs suffer from "goal drift": they lose focus over long outputs, struggle to generalize beyond the context in the prompt, and cannot reset their own context dynamically.
- While LLMs can be improved with clever prompting and iteration, they still struggle with tasks that require memory and computation, indicating that they demonstrate more intuition than intelligence.