Mark Lohmeyer, the VP and GM for compute and ML infrastructure at Google Cloud, stated that the A3 is purpose-built to handle demanding and scalable generative AI workloads. It leverages Google's unique innovations, including its networking technologies and infrastructure processing unit (IPU) offloads, to deliver the massive scale and performance these workloads require. The announcement was made at the Cloud Next conference.
Key takeaways:
- Google Cloud announced the launch of its H100-powered A3 GPU virtual machines, which combine Nvidia’s chips with Google’s custom-designed 200 Gbps Infrastructure Processing Units (IPUs); a rough bandwidth calculation follows this list.
- The A3 machines are expected to be in high demand due to their focus on training and serving generative AI models and large language models.
- The A3 machines offer up to 26 exaflops of AI performance and up to 10x more network bandwidth compared to the previous-generation A2 machines.
- Mark Lohmeyer, the VP and GM for compute and ML infrastructure at Google Cloud, stated that the A3 is purpose-built to handle incredibly demanding and scalable generative AI workloads and large language models.
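To put the networking figures above in perspective, here is a back-of-the-envelope sketch in Python. Only the 200 Gbps link speed and the 10x-over-A2 comparison come from the takeaways above; the 70B-parameter model size and FP16 precision are illustrative assumptions, not figures from the announcement.

```python
# Rough estimate: time to move one full copy of a large model's weights
# over a 200 Gbps link (the per-VM IPU bandwidth cited above).
# The 70B-parameter count and FP16 precision are illustrative assumptions,
# not figures from Google's announcement.

LINK_GBPS = 200                  # A3's custom IPU networking, per the announcement
PARAMS = 70e9                    # hypothetical 70B-parameter model
BYTES_PER_PARAM = 2              # FP16 weights

model_bytes = PARAMS * BYTES_PER_PARAM       # ~140 GB of weights
link_bytes_per_sec = LINK_GBPS * 1e9 / 8     # 200 Gbps -> 25 GB/s

seconds = model_bytes / link_bytes_per_sec
print(f"~{model_bytes / 1e9:.0f} GB of weights, "
      f"~{seconds:.1f} s per full transfer at {LINK_GBPS} Gbps")
```

At roughly one-tenth of that bandwidth, as the takeaways describe for the previous-generation A2, the same transfer would take on the order of a minute, which is why the networking and offload upgrades matter for multi-node training and serving of large models.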