Arctic uses a Dense-MoE (mixture-of-experts) hybrid architecture, dividing its parameters into as many as 128 fine-grained expert subgroups. Each input token is routed only to the experts best equipped to handle it, so only a small subset of the model's parameters is activated in response to a query, delivering targeted performance with minimal compute consumption. Snowflake is making Arctic available inside Cortex, its own LLM app development service, and across other model gardens and catalogs, including Hugging Face, Lamini, Microsoft Azure, the Nvidia API catalog, Perplexity, and Together.
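To make the routing idea concrete, here is a minimal sketch of top-k mixture-of-experts routing in PyTorch. The dimensions and the `router`, `experts`, and `moe_forward` names are illustrative assumptions, not Arctic's published implementation, and the dense half of the hybrid is omitted for brevity:

```python
# A minimal sketch of top-k MoE routing with toy sizes and a simple linear
# router; it illustrates the general technique, not Arctic's actual design.
import torch
import torch.nn.functional as F

d_model, n_experts, top_k = 64, 128, 2  # hypothetical sizes for illustration

router = torch.nn.Linear(d_model, n_experts)  # learned gating network
experts = torch.nn.ModuleList(
    torch.nn.Linear(d_model, d_model) for _ in range(n_experts)
)

def moe_forward(tokens: torch.Tensor) -> torch.Tensor:
    """Route each token to its top-k experts; tokens: (n_tokens, d_model)."""
    gate_logits = router(tokens)                    # (n_tokens, n_experts)
    weights, idx = gate_logits.topk(top_k, dim=-1)  # best k experts per token
    weights = F.softmax(weights, dim=-1)            # normalize over chosen experts
    out = torch.zeros_like(tokens)
    for k in range(top_k):
        for e in idx[:, k].unique().tolist():       # only selected experts run
            mask = idx[:, k] == e
            out[mask] += weights[mask, k, None] * experts[e](tokens[mask])
    return out

print(moe_forward(torch.randn(4, d_model)).shape)   # torch.Size([4, 64])
```

In a real model the experts are large feed-forward blocks, and only the selected few consume compute for each token, which is how a model can hold a very large total parameter count while keeping the per-query activated footprint small.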
Key takeaways:
- Snowflake has announced the launch of Arctic, a large language model optimized for complex enterprise workloads such as SQL generation, code generation, and instruction following.
- Arctic uses a Dense-MoE hybrid architecture, dividing its parameters into as many as 128 fine-grained expert subgroups and delivering targeted performance with minimal compute consumption.
- Arctic is available inside Cortex, Snowflake's own LLM app development service, and across other model gardens and catalogs, including Hugging Face, Lamini, Microsoft Azure, the Nvidia API catalog, Perplexity, and Together.
- Snowflake is also releasing a data recipe and comprehensive research cookbooks with insights into how the model was designed and trained, aiming to expedite the learning process for anyone looking to build world-class MoE models.