The authors conclude that fine-tuning is a powerful tool for NL-to-SQL tasks, matching the state of the art in accuracy while being faster and cheaper. However, they note that few organizations have NL-to-SQL training datasets readily available, suggesting that the best architectures will combine fine-tuned models with Retrieval-Augmented Generation (RAG) agents. They anticipate further progress in the field with the launch of GPT-4 fine-tuning.
Key takeaways:
- The article discusses the process of fine-tuning OpenAI's GPT-3.5-Turbo for Natural Language to SQL (NL-to-SQL) tasks using the Spider dataset from Yale University. The aim is to enable non-technical users to query a database using natural language.
- The training dataset for fine-tuning follows a specific format: a system prompt (the instructions, database schema, and database content), a user prompt (the natural-language question), and an assistant prompt (the SQL query and a reasoning step); see the sketch after this list.
- The fine-tuned GPT-3.5-Turbo model improved performance by nearly 11 percent, matching the accuracy of the current state-of-the-art approach, DIN-SQL + GPT-4, while significantly reducing both cost and processing time.
- Despite the success of fine-tuning, the authors note that few organizations have NL-to-SQL training datasets readily available. They suggest that the best architectures will combine fine-tuned models with Retrieval-Augmented Generation (RAG) agents, particularly once GPT-4 fine-tuning launches.
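To make the training-data format concrete, the sketch below shows what a single example might look like in OpenAI's chat fine-tuning format (one JSON object per line in a JSONL file). The table schema, question, and SQL are illustrative placeholders, not examples taken from the article or the Spider dataset.

```python
import json

# Hypothetical training example in OpenAI's chat fine-tuning format:
# system prompt (instructions + schema + content), user prompt (the question),
# assistant prompt (reasoning step + SQL), mirroring the structure described above.
example = {
    "messages": [
        {
            "role": "system",
            "content": (
                "You translate natural-language questions into SQL.\n"
                "Database schema:\n"
                "CREATE TABLE singer (singer_id INT, name TEXT, age INT);\n"
                "Answer with a short reasoning step followed by the SQL query."
            ),
        },
        {
            "role": "user",
            "content": "How many singers are older than 30?",
        },
        {
            "role": "assistant",
            "content": (
                "The question asks for a count of rows in singer filtered on age.\n"
                "SELECT COUNT(*) FROM singer WHERE age > 30;"
            ),
        },
    ]
}

# Append each example as its own line of the JSONL training file.
with open("nl2sql_train.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```

Once the JSONL file is assembled, it can be uploaded and a fine-tuning job started with the OpenAI Python SDK (current versions expose `client.files.create(..., purpose="fine-tune")` and `client.fine_tuning.jobs.create(model="gpt-3.5-turbo", ...)` for this); the exact calls in the article may differ depending on the SDK version used.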