DBRX is a transformer-based, decoder-only LLM trained with next-token prediction. It has 132B total parameters, of which 36B are active on any given input, and it was pre-trained on 12T tokens of text and code data. DBRX is more efficient to train and use, with inference up to 2x faster than LLaMA2-70B. It is also being integrated into Databricks' GenAI-powered products, where it is already surpassing GPT-3.5 Turbo in applications like SQL.
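The gap between total and active parameters comes from the mixture-of-experts design: a router sends each token to only a few expert feed-forward networks, so only a fraction of the weights is used per token. The sketch below is a minimal, illustrative top-k routed MoE block in PyTorch; the layer sizes, expert count, and top-k value are placeholder assumptions, not DBRX's actual configuration.

```python
# Minimal sketch of a top-k routed mixture-of-experts (MoE) feed-forward block.
# Sizes and expert counts are illustrative placeholders, not DBRX's real config.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    def __init__(self, d_model=1024, d_hidden=4096, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                      # x: (tokens, d_model)
        scores = self.router(x)                # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the chosen experts only
        out = torch.zeros_like(x)
        # Only top_k experts run per token, so the "active" parameter count per
        # token is a fraction of the layer's total parameter count.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

layer = MoEFeedForward()
tokens = torch.randn(4, 1024)
print(layer(tokens).shape)  # torch.Size([4, 1024])
```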
Key takeaways:
- Databricks has introduced DBRX, an open, general-purpose large language model (LLM) that sets a new state-of-the-art for established open LLMs and surpasses GPT-3.5.
- DBRX uses a fine-grained mixture-of-experts (MoE) architecture, making it more efficient to train and faster at inference than dense models like LLaMA2-70B.
- The model was trained on 12T tokens of text and code data, and it is being integrated into Databricks' GenAI-powered products, where it surpasses GPT-3.5 Turbo in applications like SQL.
- Databricks customers can now use DBRX via APIs, and they can also pretrain their own DBRX-class models from scratch or continue training on top of one of Databricks' checkpoints.
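As a rough illustration of the API route, the sketch below queries a DBRX serving endpoint through the OpenAI-compatible Python client. The endpoint name (`databricks-dbrx-instruct`), workspace URL, and environment variables are assumptions for illustration; check them against your own Databricks workspace configuration.

```python
# Sketch: querying a DBRX endpoint via an OpenAI-compatible interface.
# Endpoint name, base URL, and env vars are illustrative assumptions.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DATABRICKS_TOKEN"],                        # personal access token (assumed env var)
    base_url=os.environ["DATABRICKS_HOST"] + "/serving-endpoints",  # e.g. https://<workspace-url>
)

response = client.chat.completions.create(
    model="databricks-dbrx-instruct",  # assumed endpoint name; verify in your workspace
    messages=[
        {"role": "system", "content": "You are a helpful SQL assistant."},
        {"role": "user", "content": "Write a SQL query that returns the ten most recent orders."},
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```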