Jamba requires `transformers` version 4.39.0 or higher and must be run on a CUDA device. It can be loaded in half precision or 8-bit precision, and it can be fine-tuned for custom solutions. The model has shown impressive results on common benchmarks such as HellaSwag, ARC-Challenge, and WinoGrande. However, it is a base model without safety moderation mechanisms, so guardrails should be added for responsible and safe use.
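As a rough illustration of the loading options above, the following sketch shows half-precision loading with an 8-bit alternative commented out. The Hub model ID `ai21labs/Jamba-v0.1` and the example prompt are assumptions here, not details taken from this section:

```python
# Minimal sketch: loading Jamba in half precision (or 8-bit) with transformers.
# The model ID "ai21labs/Jamba-v0.1" is an assumed Hugging Face Hub identifier.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "ai21labs/Jamba-v0.1"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Half precision (bfloat16) on the available CUDA device(s).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Alternatively, 8-bit quantization via bitsandbytes to reduce memory use:
# quant_config = BitsAndBytesConfig(load_in_8bit=True)
# model = AutoModelForCausalLM.from_pretrained(
#     model_id,
#     quantization_config=quant_config,
#     device_map="auto",
# )

# Simple generation check (prompt is illustrative only).
inputs = tokenizer("Hybrid SSM-Transformer models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```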
Key takeaways:
- Jamba is a state-of-the-art, hybrid SSM-Transformer LLM developed by AI21, with 12B active parameters and a total of 52B parameters across all experts.
- The model supports a 256K context length and can fit up to 140K tokens on a single 80GB GPU.
- Jamba is a pretrained base model intended for use as a foundation layer for fine-tuning, training, and developing custom solutions (a fine-tuning sketch follows this list). It does not have safety moderation mechanisms.
- AI21, the developer of Jamba, builds reliable, practical, and scalable AI solutions for the enterprise.
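Since the model is positioned as a foundation for fine-tuning, one common starting point is a parameter-efficient LoRA adapter via the `peft` library. This is a hedged sketch, not AI21's official recipe: the model ID and `target_modules` names are assumptions and should be checked against the checkpoint's actual layer names.

```python
# Hedged sketch: LoRA fine-tuning setup for Jamba with peft.
# Model ID and target_modules are assumptions, not an official recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "ai21labs/Jamba-v0.1"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach LoRA adapters to the attention projections (assumed module names;
# inspect model.named_modules() to confirm them for your checkpoint).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# From here, train with transformers.Trainer or trl's SFTTrainer on your dataset.
```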