However, the approach is not flawless: CriticGPT can itself make mistakes. Even so, the technique could make OpenAI's models and tools like ChatGPT more accurate by reducing the errors human trainers introduce during training. It could also prove crucial in making AI models smarter, potentially allowing humans to train an AI that exceeds their own abilities. The new technique is part of a broader effort to improve large language models and to ensure that AI behaves in acceptable ways as it becomes more capable.
Key takeaways:
- OpenAI has developed a new model, CriticGPT, to assist human trainers in assessing code; it has proven more effective at catching bugs and producing useful critiques of code than trainers working unaided.
- The company is looking to extend this approach beyond code and integrate it into its reinforcement learning from human feedback (RLHF) chat stack, which could make AI models and tools like ChatGPT more accurate by reducing errors made by human trainers.
- The new technique could also be crucial in helping AI models become smarter, potentially allowing humans to train an AI that exceeds their own abilities.
- The development is part of a larger effort to improve large language models, ensure AI systems behave acceptably as they become more capable, and make their output more trustworthy and aligned with human values.