Jamba requires `transformers` version 4.39.0 or higher and must be run on a CUDA device. It can be loaded in half precision or 8-bit precision, and it can be fine-tuned for custom solutions. The model has shown impressive results on common benchmarks such as HellaSwag, ARC-Challenge, and WinoGrande. However, it is a base model without safety moderation mechanisms, so guardrails should be added for responsible and safe use.
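As a rough illustration of the loading options above, the following sketch shows half-precision loading with an 8-bit alternative commented out. The Hub model ID `ai21labs/Jamba-v0.1` and the example prompt are assumptions here, not details taken from this section:

```python
# Minimal sketch: loading Jamba in half precision (or 8-bit) with transformers.
# The model ID "ai21labs/Jamba-v0.1" is an assumed Hugging Face Hub identifier.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "ai21labs/Jamba-v0.1"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Half precision (bfloat16) on the available CUDA device(s).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Alternatively, 8-bit quantization via bitsandbytes to reduce memory use:
# quant_config = BitsAndBytesConfig(load_in_8bit=True)
# model = AutoModelForCausalLM.from_pretrained(
#     model_id,
#     quantization_config=quant_config,
#     device_map="auto",
# )

# Simple generation check (prompt is illustrative only).
inputs = tokenizer("Hybrid SSM-Transformer models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```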
Key takeaways:
- Jamba is a state-of-the-art, hybrid SSM-Transformer LLM developed by AI21, with 12B active parameters and a total of 52B parameters across all experts.
- The model supports a 256K context length and can fit up to 140K tokens on a single 80GB GPU.
- Jamba is a pretrained base model intended for use as a foundation layer for fine-tuning, training, and developing custom solutions (a fine-tuning sketch follows this list). It does not have safety moderation mechanisms.
- AI21, the developer of Jamba, builds reliable, practical, and scalable AI solutions for the enterprise.
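Since the model is positioned as a foundation for fine-tuning, one common starting point is a parameter-efficient LoRA adapter via the `peft` library. This is a hedged sketch, not AI21's official recipe: the model ID and `target_modules` names are assumptions and should be checked against the checkpoint's actual layer names.

```python
# Hedged sketch: LoRA fine-tuning setup for Jamba with peft.
# Model ID and target_modules are assumptions, not an official recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "ai21labs/Jamba-v0.1"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach LoRA adapters to the attention projections (assumed module names;
# inspect model.named_modules() to confirm them for your checkpoint).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# From here, train with transformers.Trainer or trl's SFTTrainer on your dataset.
```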