Textbooks Are All You Need II: phi-1.5 technical report

Sep 12, 2023 - news.bensbites.co
The article discusses the development and capabilities of phi-1.5, a new 1.3 billion parameter model. Building on earlier work on TinyStories and phi-1, it is trained on "textbook quality" data generated by Large Language Models (LLMs), which provides a cleaner learning signal than traditional web data. Phi-1.5 focuses on common sense reasoning in natural language and performs on par with models five times its size, while surpassing most non-frontier LLMs on more complex reasoning tasks such as grade-school mathematics and basic coding.
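To make the synthetic-data idea concrete, here is a minimal sketch of what LLM-driven "textbook quality" generation might look like. The prompt wording, topic list, and use of the OpenAI Python client are illustrative assumptions; the report does not publish the authors' actual generation pipeline.

```python
# Hypothetical sketch of generating "textbook quality" training data
# with a strong LLM. The prompt template and topics are illustrative
# assumptions, not the phi-1.5 team's published pipeline.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

TOPICS = ["logical reasoning", "everyday physics", "basic arithmetic"]

def generate_textbook_passage(topic: str) -> str:
    """Ask an LLM to write a short, self-contained lesson on a topic."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": (
                f"Write a short textbook-style passage teaching {topic}, "
                "with a worked example and a brief exercise."
            ),
        }],
    )
    return response.choices[0].message.content

# Each generated passage becomes one document in the training corpus.
corpus = [generate_textbook_passage(t) for t in TOPICS]
```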

Phi-1.5 exhibits many traits of larger LLMs, including the ability to "think step by step" and perform rudimentary in-context learning. However, it also shares some of their drawbacks, including hallucinations and the potential for toxic or biased generations, though the authors report these are reduced relative to web-trained models, which they attribute to the absence of web data in training. The creators of phi-1.5 have open-sourced the model to encourage further research into these topics.
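Because the model is open-sourced, its step-by-step behavior can be probed directly. The sketch below assumes the Hugging Face transformers library and the published checkpoint identifier microsoft/phi-1_5; depending on the transformers version, trust_remote_code=True may be required when loading.

```python
# Minimal sketch of prompting phi-1.5 to "think step by step".
# Assumes the Hugging Face checkpoint id "microsoft/phi-1_5"; older
# transformers releases may also need trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-1_5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = (
    "Question: Alice has 3 apples and buys 2 more bags of 4 apples each. "
    "How many apples does she have? Let's think step by step."
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```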

Key takeaways:

  • The research continues the investigation into the power of smaller Transformer-based language models, following the work of TinyStories and phi-1.
  • A new 1.3 billion parameter model named phi-1.5 has been created, focusing on common sense reasoning in natural language. Its performance on natural language tasks is comparable to models five times larger.
  • Phi-1.5 surpasses most non-frontier Large Language Models (LLMs) on complex reasoning tasks such as grade-school mathematics and basic coding, and exhibits traits of much larger LLMs.
  • Phi-1.5 retains potential drawbacks, including hallucinations and toxic or biased generations, though these are reduced relative to web-trained models, partly due to the absence of web data in training.