Jamba builds on Mamba, an open-source model from Princeton and Carnegie Mellon researchers, as its base model. Although Jamba is released under the Apache 2.0 license, AI21 Labs CEO Ori Goshen stresses that it is a research release not intended for commercial use, since it lacks safeguards against generating toxic text and does not mitigate potential bias. A safer version of the model is planned for release in the coming weeks. Goshen believes the SSM architecture holds promise and expects performance to improve as Mamba receives further tweaks.
Key takeaways:
- AI21 Labs, an AI startup, is releasing a generative model called Jamba that can handle large context windows while running on a single GPU with at least 80GB of memory.
- Jamba uses a combination of two model architectures: transformers and state space models (SSMs), with the latter being more computationally efficient and capable of handling long sequences of data.
- While Jamba has been released under the Apache 2.0 license, it is intended as a research release rather than a commercial product, because it lacks safeguards against generating toxic text and does not mitigate potential bias.
- AI21 Labs CEO Ori Goshen believes Jamba's performance will improve as its base model, Mamba, receives additional tweaks.
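The efficiency claim in the takeaways above comes down to how the two architectures scale with sequence length. The following toy sketch (not AI21's implementation; all function names and coefficients here are hypothetical, chosen only for illustration) shows the core contrast: an SSM processes a sequence with a linear-time recurrence over a fixed-size state, while self-attention must score every token pair, which grows quadratically.

```python
# Toy illustration of the SSM-vs-attention scaling contrast described above.
# This is NOT Jamba or Mamba code; it is a minimal sketch under simplified
# assumptions (scalar state, hand-picked coefficients a and b).

def ssm_scan(inputs, a=0.9, b=0.5):
    """Minimal 1-D linear state space recurrence: h_t = a*h_{t-1} + b*x_t.
    A single pass over the sequence -> O(n) time, O(1) state."""
    h = 0.0
    outputs = []
    for x in inputs:
        h = a * h + b * x
        outputs.append(h)
    return outputs

def attention_pair_count(n):
    """Self-attention compares every token with every token -> O(n^2) scores."""
    return n * n

if __name__ == "__main__":
    seq = [1.0, 0.0, 0.0, 0.0]
    print(ssm_scan(seq))            # state decays geometrically after the impulse
    print(attention_pair_count(4))  # 16 pairwise comparisons for 4 tokens
```

Because the recurrence carries only a fixed-size state forward, cost grows linearly with context length, which is why an SSM-based hybrid can serve long context windows on a single 80GB GPU more readily than a pure transformer of comparable size.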