
AI startup Cerebras unveils the WSE-3, the largest chip yet for generative AI

Mar 13, 2024 - zdnet.com
Cerebras Systems, a competitor to Nvidia, has unveiled the Wafer Scale Engine 3 (WSE-3), the third generation of its AI chip and the world's largest semiconductor. The WSE-3 doubles the rate of instructions carried out by its predecessor, the WSE-2, and moves from a 7-nanometer to a 5-nanometer manufacturing process, boosting the transistor count from 2.6 trillion to 4 trillion. The chip is designed for training AI models and can handle a theoretical large language model of 24 trillion parameters, significantly more than top AI tools such as OpenAI's GPT-4.

Cerebras has also announced a partnership with Qualcomm to use the latter's AI 100 processor for the inference stage of generative AI, which involves making predictions on live traffic. The partnership applies four techniques to reduce the cost of inference: sparsity, speculative decoding, conversion of output into a compiled form, and network architecture search. The collaboration is expected to increase the number of tokens processed per dollar spent on the Qualcomm chip by an order of magnitude.
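Of the four techniques, speculative decoding is the most algorithmic: a small, cheap "draft" model proposes several tokens at once, and the large "target" model verifies them, accepting the run of proposals up to the first mismatch. The sketch below is a toy illustration of that accept/reject loop only; the models and token arithmetic are made up for the example and have nothing to do with Cerebras' or Qualcomm's actual implementations.

```python
def draft_model(context, k):
    # Toy cheap "draft" model: proposes the next k tokens.
    # Deliberately imperfect (wraps at 10 instead of 100), so some
    # proposals get rejected by the target model.
    out, last = [], context[-1]
    for _ in range(k):
        last = (last + 1) % 10
        out.append(last)
    return out

def target_model(context):
    # Toy expensive "target" model: the ground-truth next token.
    return (context[-1] + 1) % 100

def speculative_decode(context, steps, k=4):
    # Generate `steps` tokens. The draft proposes k tokens at a time;
    # the target verifies each one. Accepted tokens come "for free";
    # on the first mismatch we keep the target's token and re-draft.
    tokens = list(context)
    produced = 0
    while produced < steps:
        proposal = draft_model(tokens, k)
        for tok in proposal:
            if produced >= steps:
                break
            truth = target_model(tokens)
            tokens.append(truth)   # output always matches the target model
            produced += 1
            if tok != truth:
                break              # reject the rest of this draft
    return tokens[len(context):]
```

The key property the toy preserves is that the output is identical to what the target model alone would produce; the draft model only affects how many expensive verification rounds are needed, which is where the cost savings come from.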

Key takeaways:

  • The WSE-3 is the third generation of Cerebras' AI chip and the world's largest semiconductor, doubling peak performance from the WSE-2's 62.5 petaFLOPs to 125 petaFLOPs.
  • The move from a 7-nanometer to a 5-nanometer process raised the transistor count from 2.6 trillion in the WSE-2 to 4 trillion. The chip is manufactured by TSMC, the world's largest contract chipmaker.
  • Cerebras' CS-3 computer, built around the WSE-3, can handle a theoretical large language model of 24 trillion parameters, an order of magnitude more than top-of-the-line generative AI tools such as OpenAI's GPT-4.
  • A partnership with Qualcomm will use the latter's AI 100 processor for generative AI inference, i.e. making predictions on live traffic, applying four techniques to reduce the cost of inference.
