Google Cloud also plans to let users scale their TPU clusters beyond previous limits, so that a single AI workload can span multiple physical TPU clusters and scale cost-effectively to tens of thousands of chips. Additionally, Google announced that next month, it will make Nvidia’s H100 GPUs generally available to developers as part of its A3 series of virtual machines.
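To make the scaling claim concrete, the sketch below shows the general pattern of spreading one workload across every attached TPU chip using standard JAX sharding, which is commonly used on TPUs. This is only an illustration of the kind of multi-device scaling described, not Google's actual cross-cluster API; the mesh shape, batch dimensions, and the axis name "data" are all arbitrary choices for the example.

```python
import jax
import jax.numpy as jnp
from jax.experimental import mesh_utils
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Enumerate every attached accelerator; on a large TPU deployment this
# list can run to thousands of chips.
devices = jax.devices()
mesh = Mesh(mesh_utils.create_device_mesh((len(devices),)), axis_names=("data",))

# Shard one large batch across all devices along its leading axis.
batch = jnp.ones((len(devices) * 128, 1024))
sharded = jax.device_put(batch, NamedSharding(mesh, P("data", None)))

@jax.jit
def step(x):
    # Each device computes on its local shard; JAX inserts the
    # cross-device communication needed to produce the global mean.
    return jnp.mean(x * 2.0)

print(step(sharded))
```

The same program runs unchanged whether one chip or thousands are attached, which is the property that makes this style of scaling attractive as cluster limits grow.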
Key takeaways:
- Google Cloud has announced the launch of the fifth generation of its tensor processing units (TPUs) for AI training and inferencing at its annual user conference, Cloud Next.
- The new chip promises a 2x improvement in training performance per dollar and a 2.5x improvement in inferencing performance per dollar, making it the most cost-efficient and accessible cloud TPU to date.
- Google Cloud will let users scale their TPU clusters beyond what was previously possible, allowing a single large AI workload to span multiple physical TPU clusters and scale to tens of thousands of chips.
- Google also announced that next month, it will make Nvidia’s H100 GPUs generally available to developers as part of its A3 series of virtual machines.
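As a practical note, once a GPU-backed VM such as an A3 instance is provisioned, a quick way to confirm the accelerators are visible is to enumerate devices from the framework. The minimal JAX check below is a generic sketch; nothing in it is A3- or H100-specific.

```python
import jax

# List every accelerator JAX can see on this VM (e.g., the H100s on an
# A3 instance), printing each device's platform, kind, and id.
for d in jax.devices():
    print(d.platform, d.device_kind, d.id)
```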