C4AI has released the open weights of both the 8B and 35B models on Hugging Face under the Creative Commons Attribution-NonCommercial 4.0 International Public License (CC BY-NC 4.0). This allows third-party researchers to fine-tune the models to fit their individual needs. However, the release falls short of a full open-source release, as the training data and underlying architecture have not been released. Users can try out the new models on the Cohere Playground for free.
Key takeaways:
- Cohere for AI announced the open weights release of Aya 23, a new family of multilingual language models that covers 23 languages and outperforms other open models such as Google’s Gemma and Mistral’s various open-source models.
- Aya 23 builds on the original Aya 101 model and is part of C4AI’s Aya initiative, which aims to deliver strong multilingual capabilities.
- Compared to Aya 101, Aya 23 improves on discriminative tasks by up to 14%, on generative tasks by up to 20%, and on multilingual MMLU by up to 41.6%.
- The open weights for both the 8B and 35B models have been released on Hugging Face under the Creative Commons Attribution-NonCommercial 4.0 International Public License (CC BY-NC 4.0), and users can try out the new models on the Cohere Playground for free.