
AI21 Labs Unveils Jamba: The First Production-Grade Mamba-Based AI Model

Mar 28, 2024 - maginative.com
AI21 Labs has launched Jamba, the first production-grade AI model based on the Mamba architecture, which it combines with the traditional Transformer architecture. Jamba has a context window of 256K tokens, fitting up to 140K tokens on a single 80GB GPU, and activates just 12B of its 52B parameters during inference. This lets it handle longer contexts than models like Meta's Llama 2 while maintaining high throughput and efficiency. Jamba's hybrid SSM-Transformer architecture incorporates mixture-of-experts (MoE) layers, delivering three times the throughput on long contexts compared to similar-sized Transformer-based models.
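The gap between 52B total and 12B active parameters is a direct consequence of MoE routing: each token is processed by only a few experts, so most expert weights are loaded in memory but idle on any given forward pass. A minimal back-of-the-envelope sketch in Python illustrates the shape of that calculation; the expert count, routing top-k, and shared/expert parameter split below are invented for illustration and are not figures from the article:

```python
# Illustrative only: the config below is assumed, not Jamba's real breakdown.
total_params = 52e9          # all weights must be stored
n_experts, top_k = 16, 2     # assumed MoE config: route each token to 2 of 16 experts
shared = 8e9                 # assumed non-expert (always-active) weights

# A token touches all shared weights plus roughly top_k / n_experts
# of the expert weights:
expert = total_params - shared
active = shared + expert * top_k / n_experts

print(f"{active / 1e9:.1f}B active of {total_params / 1e9:.0f}B total")
# -> 13.5B active of 52B total (same order as Jamba's reported 12B)
```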

Jamba uses a blocks-and-layers approach: each block contains either an attention or a Mamba layer, followed by a multi-layer perceptron (MLP), yielding one attention (Transformer) layer for every eight total layers. This ratio maximizes quality and throughput on a single GPU. The model has shown impressive results on various benchmarks, matching or outperforming state-of-the-art models in its size class. Jamba is released with open weights under the Apache 2.0 license and is available on Hugging Face and in the NVIDIA API catalog. AI21 Labs plans to release a fine-tuned, safer version for commercial use in the coming weeks.
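To make that layout concrete, here is a minimal PyTorch sketch of the one-attention-layer-in-eight interleaving described above. Everything in it is illustrative rather than AI21's implementation: MambaLayerStub stands in for a real selective state-space layer, and the dense MLP stands in for the MoE layers Jamba actually uses; only the interleaving pattern mirrors the article.

```python
import torch
import torch.nn as nn

class MambaLayerStub(nn.Module):
    """Placeholder for a Mamba (SSM) mixing layer; a real one runs a
    selective state-space scan with per-token state, not a Linear."""
    def __init__(self, d_model):
        super().__init__()
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x):
        return x + self.proj(x)  # residual connection, as in standard blocks

class AttentionLayer(nn.Module):
    """Standard self-attention mixing layer (the 'Transformer' layers)."""
    def __init__(self, d_model, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        out, _ = self.attn(x, x, x)
        return x + out

class MLP(nn.Module):
    """Dense MLP following each mixing layer; Jamba replaces some of
    these with MoE layers."""
    def __init__(self, d_model):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(),
            nn.Linear(4 * d_model, d_model))

    def forward(self, x):
        return x + self.net(x)

def build_jamba_like_stack(n_layers=32, d_model=512, attn_every=8):
    """One attention layer per `attn_every` layers, Mamba elsewhere;
    each mixing layer is followed by an MLP, per the article."""
    layers = []
    for i in range(n_layers):
        mixer = (AttentionLayer(d_model) if i % attn_every == 0
                 else MambaLayerStub(d_model))
        layers += [mixer, MLP(d_model)]
    return nn.Sequential(*layers)

model = build_jamba_like_stack()
x = torch.randn(2, 16, 512)  # (batch, tokens, d_model)
print(model(x).shape)        # torch.Size([2, 16, 512])
```

The intuition behind the ratio: attention layers give exact token-to-token mixing but their memory cost grows with sequence length, while SSM layers carry a fixed-size state, so keeping attention to one layer in eight is what lets the model fit 140K tokens of context on a single GPU.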

Key takeaways:

  • AI21 Labs has released Jamba, the world's first production-grade AI model based on the Mamba architecture, combining the strengths of the Mamba structured state-space model (SSM) with the traditional Transformer architecture.
  • Jamba has an extensive context window of 256K tokens, fitting up to 140K tokens on a single 80GB GPU, and can handle significantly longer contexts than most of its counterparts.
  • Jamba delivers 3x the throughput on long contexts compared to Transformer-based models of similar size, thanks to its hybrid architecture of Transformer, Mamba, and mixture-of-experts (MoE) layers.
  • Jamba is released with open weights under the Apache 2.0 license, available on Hugging Face and in the NVIDIA API catalog, with a fine-tuned, safer version for commercial use planned in the coming weeks.
