The system aims to bridge the gap between the sophisticated system prompts used in production AI and the basic prompts most developers typically write. By letting the model refine its problem-solving strategies based on the kinds of problems it encounters, the approach improves LLM performance without any pretraining or fine-tuning. Strategies are stored in human-readable JSON, so users can inspect them, edit them, and understand what the system has learned. The article invites feedback and discussion on this novel approach to LLM learning.
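Because strategies are persisted as plain JSON, a user can open the file, read an entry, and edit it by hand. As a rough illustration, a single stored strategy might look like the record below; the field names here are assumptions for the sketch, not the plugin's actual schema:

```python
import json

# Hypothetical shape of one stored strategy entry; field names are
# illustrative, not the plugin's actual schema.
strategy = {
    "problem_type": "word_problem",
    "strategy": (
        "Restate the problem, define variables, set up equations, "
        "solve step by step, then verify the answer."
    ),
    "success_count": 12,
    "attempt_count": 15,
}

# Human-readable serialization: this is what makes the learned
# strategies easy to inspect and edit directly.
serialized = json.dumps(strategy, indent=2)
restored = json.loads(serialized)
print(restored["problem_type"])
```

The round trip through `json.dumps`/`json.loads` is the whole storage story: no opaque weights, just text a user can diff and version.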
Key takeaways:
- System allows LLMs to automatically learn and improve problem-solving strategies over time by building a database of effective strategies for different problem types.
- Strategies are stored as human-readable JSON, enabling inspection and editing, and the system has shown improvements on benchmarks such as Arena Hard and AIME24.
- Implementation is an open-source plugin for optillm, compatible with any OpenAI-compatible API, featuring inference-only and learning modes.
- Approach bridges the gap between sophisticated system prompts used in production AI and basic prompts, inspired by Andrej Karpathy's "third paradigm" for LLM learning.