
Introducing DBRX: A New State-of-the-Art Open LLM | Databricks

Mar 27, 2024 - databricks.com
Databricks has introduced DBRX, an open, general-purpose Large Language Model (LLM) that surpasses established models like GPT-3.5 and is competitive with Gemini 1.0 Pro. DBRX uses a fine-grained mixture-of-experts (MoE) architecture, making it more efficient in terms of training and inference performance. It is particularly adept at programming tasks, outperforming specialized models like CodeLLaMA-70B. DBRX is available for Databricks customers to use via APIs, and they can also pretrain their own DBRX-class models.
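
Since the summary notes that DBRX is available to Databricks customers via APIs, here is a minimal sketch of what a call might look like through an OpenAI-compatible serving endpoint. The endpoint name `databricks-dbrx-instruct`, the workspace URL, and the token placeholder are assumptions for illustration; check your workspace's serving endpoints for the actual values.

```python
# Hedged sketch: querying DBRX via an OpenAI-compatible Databricks
# serving endpoint. Names and URLs below are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    api_key="<DATABRICKS_TOKEN>",  # personal access token (placeholder)
    base_url="https://<workspace>.databricks.com/serving-endpoints",
)

response = client.chat.completions.create(
    model="databricks-dbrx-instruct",  # assumed endpoint name
    messages=[
        {"role": "user", "content": "Write a SQL query that counts rows per day."}
    ],
    max_tokens=200,
)
print(response.choices[0].message.content)
```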

DBRX is a transformer-based, decoder-only LLM trained with next-token prediction. It has 132B total parameters, of which 36B are active on any given input, and it was pretrained on 12T tokens of text and code data. DBRX is more efficient to train and serve, with inference up to 2x faster than LLaMA2-70B. It is also being integrated into Databricks' GenAI-powered products, where it already surpasses GPT-3.5 Turbo in applications such as SQL.
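
To make the "fine-grained MoE" idea concrete, the sketch below shows top-k expert routing in NumPy: only the selected experts run for each token, which is why 36B of the 132B parameters are active per input. The 16-experts/4-active split matches figures Databricks has reported for DBRX, but the layer sizes, router, and activation here are simplified illustrations, not DBRX's actual implementation.

```python
# Minimal sketch of top-k routing in a fine-grained MoE layer (NumPy).
# Expert count and top-k mirror reported DBRX figures; dimensions are toy.
import numpy as np

rng = np.random.default_rng(0)

d_model, d_ff = 64, 256      # illustrative sizes, far smaller than DBRX
n_experts, top_k = 16, 4     # DBRX reportedly routes each token to 4 of 16 experts

# Router and expert weights (each expert is a small feed-forward block).
W_router = rng.standard_normal((d_model, n_experts)) * 0.02
experts = [
    (rng.standard_normal((d_model, d_ff)) * 0.02,
     rng.standard_normal((d_ff, d_model)) * 0.02)
    for _ in range(n_experts)
]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ W_router                          # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]  # indices of chosen experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        chosen = logits[t, top[t]]
        weights = np.exp(chosen - chosen.max())
        weights /= weights.sum()                   # softmax over chosen experts only
        for w, e in zip(weights, top[t]):
            W1, W2 = experts[e]
            out[t] += w * (np.maximum(x[t] @ W1, 0.0) @ W2)  # ReLU FFN expert
    return out

tokens = rng.standard_normal((3, d_model))
print(moe_layer(tokens).shape)  # (3, 64): only 4 of 16 experts ran per token
```

Because the unselected experts never execute, per-token compute scales with the active parameter count rather than the total, which is the efficiency argument behind DBRX's training and inference numbers.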

Key takeaways:

  • Databricks has introduced DBRX, an open, general-purpose large language model (LLM) that sets a new state-of-the-art for established open LLMs and surpasses GPT-3.5.
  • DBRX uses a fine-grained mixture-of-experts (MoE) architecture, making its inference up to 2x faster than models like LLaMA2-70B.
  • The model was trained on 12T tokens of text and code data, and it has been integrated into Databricks' GenAI-powered products, surpassing GPT-3.5 Turbo in applications like SQL.
  • Databricks customers can now use DBRX via APIs, and they can also pretrain their own DBRX-class models from scratch or continue training on top of one of Databricks' checkpoints.