Why AI can't spell 'strawberry'

The article examines why large language models (LLMs) struggle with letters and syllables, using the example of counting how many times the letter "r" appears in the word "strawberry". These models are built on transformers, which break text into tokens and convert those tokens into numerical encodings, so they never work with the individual letters that make up a word. The limitation is deeply embedded in the architecture and is not easy to fix. It also gets harder as a model learns more languages, since some languages require up to 10 times as many tokens as English to convey the same meaning.
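To make the tokenization point concrete, here is a minimal sketch in Python. It is not how any production model tokenizes text: the vocabulary `VOCAB` and the helper `tokenize` are hypothetical, and the greedy longest-match rule is a simplified stand-in for the learned subword schemes real LLMs use. It only illustrates that a model receives token IDs, while counting letters requires the character-level view.

```python
# Hypothetical subword vocabulary (an assumption for illustration).
# Real tokenizers learn tens of thousands of entries from data.
VOCAB = {
    "str": 0, "aw": 1, "berry": 2,
    "s": 3, "t": 4, "r": 5, "a": 6, "w": 7, "b": 8, "e": 9, "y": 10,
}


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match tokenization, a simplified stand-in for subword tokenizers."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest candidate piece starting at position i first.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


word = "strawberry"

# What the model receives: a short list of integers with no letters in it.
token_ids = tokenize(word)

# What the question asks about: individual characters of the string.
letter_count = word.count("r")

print(token_ids)     # [0, 1, 2]
print(letter_count)  # 3
```

In this toy setup the word becomes three IDs, and nothing in those IDs says that two of the pieces contain an "r". Answering the question means recovering spelling information that the input representation does not carry directly.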

The article also discusses image generators, which use diffusion models and are trained on large databases of images. These models perform better on larger objects but struggle with smaller details, although improvements have been made by training on more images of specific objects. OpenAI is reportedly working on a new AI product, code-named Strawberry, designed to be more adept at reasoning and capable of generating accurate synthetic data to improve LLMs. Meanwhile, Google DeepMind has unveiled AI systems designed for formal math reasoning, which have shown impressive results in solving problems from the International Math Olympiad.

Key takeaways:

Large language models (LLMs) like GPT-4o and Claude, despite their advanced capabilities, struggle with letters and syllables, which indicates that they do not think like humans.
LLMs are built on transformers, a deep learning architecture that breaks text into tokens, but they do not understand the individual letters that make up words.
Image generators use diffusion models, which reconstruct an image from noise, and tend to perform better on larger objects but struggle with smaller details.
OpenAI is working on a new AI product code-named Strawberry, which is expected to be more adept at reasoning and can reportedly generate accurate synthetic data to improve LLMs.

Why AI can't spell 'strawberry' | TechCrunch
