Grok-1 is a Mixture-of-Experts model with 25% of its weights active on a given token. It was trained on a large volume of text data using a custom training stack built on top of JAX and Rust. Instructions for getting started with the model are available on xAI's GitHub page.
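The 25% figure is what top-k expert routing produces: if each token is sent to, say, 2 of 8 equally sized experts, only a quarter of the expert weights take part in that token's forward pass. The sketch below is a minimal JAX illustration of that routing pattern, not Grok-1's actual implementation; the expert count, top-k value, and layer sizes are assumptions chosen for readability.

```python
import jax
import jax.numpy as jnp

# Illustrative sizes only; these are not Grok-1's real dimensions.
NUM_EXPERTS, TOP_K = 8, 2          # 2 of 8 experts per token -> ~25% of expert weights active
D_MODEL, D_FF = 64, 256

def init_params(key):
    k_router, k_w1, k_w2 = jax.random.split(key, 3)
    return {
        "router": jax.random.normal(k_router, (D_MODEL, NUM_EXPERTS)) * 0.02,
        "w1": jax.random.normal(k_w1, (NUM_EXPERTS, D_MODEL, D_FF)) * 0.02,
        "w2": jax.random.normal(k_w2, (NUM_EXPERTS, D_FF, D_MODEL)) * 0.02,
    }

def moe_layer(params, x):
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ params["router"]                      # [tokens, experts]
    top_vals, top_idx = jax.lax.top_k(logits, TOP_K)   # pick k experts per token
    gates = jax.nn.softmax(top_vals, axis=-1)          # renormalise over the chosen experts

    def one_token(tok, idx, gate):
        # Only the k selected experts' weights are gathered and used for this token.
        w1 = params["w1"][idx]                          # [k, d_model, d_ff]
        w2 = params["w2"][idx]                          # [k, d_ff, d_model]
        hidden = jax.nn.gelu(jnp.einsum("d,kdf->kf", tok, w1))
        out = jnp.einsum("kf,kfd->kd", hidden, w2)
        return jnp.einsum("k,kd->d", gate, out)         # gate-weighted sum of expert outputs

    return jax.vmap(one_token)(x, top_idx, gates)

key = jax.random.PRNGKey(0)
params_key, data_key = jax.random.split(key)
params = init_params(params_key)
tokens = jax.random.normal(data_key, (4, D_MODEL))     # a toy batch of 4 token embeddings
print(moe_layer(params, tokens).shape)                 # (4, 64)
```

The point of the pattern is that each token pays the compute cost of only `TOP_K` experts, even though the full parameter count includes all `NUM_EXPERTS` of them.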
Key takeaways:
- The weights and architecture of Grok-1, a 314-billion-parameter Mixture-of-Experts model, are being released.
- Grok-1 is a large language model trained from scratch by xAI and is not fine-tuned for any specific application.
- The released weights and architecture are covered by the Apache 2.0 license.
- The model was trained in October 2023 using a custom training stack built on top of JAX and Rust.
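Taken together, these figures mean that roughly 78 billion of Grok-1's 314 billion parameters (about 25%) are used in any single token's forward pass.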