
AI2's open source Tulu 3 lets anyone play the AI post-training game | TechCrunch

Nov 21, 2024 - techcrunch.com
AI2, formerly known as the Allen Institute for AI, is working to close the gap between the open source AI community and big private companies by creating an open and easily adapted post-training regimen for large language models (LLMs). The organization argues that the post-training process, where the model is refined and made useful for specific applications, is where real value can be created. AI2 is committed to full transparency, from data collection to training methods, and aims to democratize the AI ecosystem. Its latest tool, Tulu 3, is a significant improvement over its predecessor and has achieved scores on par with the most advanced "open" models in tests.

Tulu 3 lets developers customize their models by choosing which topics the model should focus on, then takes the model through a long regimen of data curation, reinforcement learning, fine-tuning, and preference tuning. The result is a more capable model focused on the skills the developer needs. AI2 is using the tool itself and plans to release an OLMo-based, Tulu-3-trained model soon that will be fully open source. The aim is to reduce reliance on major companies' resources or middlemen, which can be expensive and introduce risks, especially for companies dealing with sensitive user data.

Key takeaways:

  • AI2 is working to bridge the gap between the open source AI community and big private companies by providing open source databases, models, and a post-training regimen that makes large language models usable.
  • AI2 criticizes the lack of openness in supposedly “open” AI projects, like Meta’s Llama, where the sources and process of making the raw model and the method of training it for general use remain secret.
  • Tulu 3, a new tool by AI2, is a significant improvement over previous post-training processes, allowing developers to customize their models according to their needs.
  • AI2 plans to release an OLMo-based, Tulu-3-trained model soon that will be fully open source, providing an alternative to relying on major companies' resources or middlemen for building custom-trained LLMs.