The distilling step-by-step method has two stages. First, natural-language rationales are extracted from a large language model (LLM) using few-shot chain-of-thought (CoT) prompting. Second, these rationales are used to train smaller models in a multi-task framework that combines a rationale-generation task with the standard label-prediction task. The researchers found that this method outperformed standard fine-tuning while requiring less training data. The method is available for private preview on Google's Vertex AI platform.
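The multi-task setup described above can be sketched in a few lines. This is a minimal illustration, not the paper's released code: the task prefixes, function names, and the loss-weight parameter `lam` are assumptions for exposition. The idea is that each training input yields two seq2seq examples (one supervised by the ground-truth label, one by the LLM-extracted rationale), and the small model is trained on a weighted sum of the two losses.

```python
def build_examples(question: str, label: str, rationale: str) -> list[dict]:
    """For one input, emit two seq2seq training examples:
    one for label prediction, one for rationale generation.
    The "[label]" / "[rationale]" prefixes are illustrative task markers."""
    return [
        {"input": f"[label] {question}", "target": label},
        {"input": f"[rationale] {question}", "target": rationale},
    ]


def multitask_loss(label_loss: float, rationale_loss: float, lam: float = 1.0) -> float:
    """Combined objective: L = L_label + lam * L_rationale.
    The rationale term acts as an auxiliary supervision signal;
    at inference time only the label task is used."""
    return label_loss + lam * rationale_loss
```

Because rationale generation is a separate task rather than part of the label output, the small model does not need to produce a rationale at inference time, so prediction cost stays the same as a standard fine-tuned model.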
Key takeaways:
- The researchers introduced a new mechanism called "distilling step-by-step" that allows smaller task-specific models to be trained with much less data than required by standard fine-tuning or distillation approaches.
- The mechanism extracts informative natural language rationales from large language models (LLMs), which are then used to train smaller models in a more data-efficient way.
- Experiments showed that distilling step-by-step achieves better performance than standard fine-tuning and few-shot CoT-prompted LLMs while using much less training data and much smaller models.
- Distilling step-by-step is available for private preview on Vertex AI, a Google Cloud Platform service.