The Trainium2 chip is expected to have 32 cores and will likely be shrunk from the 7 nanometer process used for Trainium1 down to a 4 nanometer process. That shrink would allow the core count to double within the same, or only slightly higher, power envelope as Trainium1. AWS is also expected to boost NeuronLink bandwidth on the Trainium2 by 33 percent, to 256 GB/sec per port, yielding an aggregate of 2 TB/sec coming out of each Trainium2. The Trainium2 is expected to deliver around 2X the performance of the Nvidia H100, which means it will compete with the H200 that Nvidia just announced, with its fatter and faster HBM3e memory.
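To make that bandwidth math concrete, here is a quick back-of-the-envelope sketch. The 192 GB/sec Trainium1 per-port figure and the eight-port count are assumptions inferred from the 33 percent uplift and the roughly 2 TB/sec aggregate, not confirmed specs:

```python
# Back-of-the-envelope check of the NeuronLink numbers above.
# Assumptions (not confirmed by AWS): Trainium1 runs 192 GB/sec per port,
# and each device has eight ports, as the stated totals imply.

ports = 8                                 # assumed port count per device
trn1_port_gbs = 192                       # assumed Trainium1 per-port bandwidth
trn2_port_gbs = trn1_port_gbs * 4 // 3    # 33 percent uplift -> 256 GB/sec

aggregate_gbs = ports * trn2_port_gbs     # 2,048 GB/sec, i.e. roughly 2 TB/sec
print(f"{trn2_port_gbs} GB/sec per port x {ports} ports = {aggregate_gbs} GB/sec aggregate")
```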
Key takeaways:
- Amazon Web Services (AWS) has revealed its second generation of Trainium AI training accelerators, the Trainium2 chip, alongside the Graviton4 server CPU at the recent re:Invent 2023 event.
- The Trainium2 chip is expected to have more cores and more memory bandwidth, with the effective performance of the chips potentially scaling to 4X that of the Trainium1 on real-world AI training workloads.
- It is speculated that the Trainium2 chip will have 32 cores and will be shrunk from the 7 nanometer process used for Trainium1 down to a 4 nanometer process.
- AWS is expected to price EC2 instances based on Trainium2 relative to instances using Nvidia's H100 and H200 GPUs at the same ratio it applies between its own Graviton CPUs and X86 processors from AMD and Intel, offering somewhere between 30 percent and 40 percent better bang for the buck, as the sketch below illustrates.
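As a rough illustration of what that pricing strategy implies, the sketch below works backward from a hypothetical H100 instance price. The $100/hour baseline is purely illustrative, the 2X performance figure is the speculation cited above, and "bang for the buck" is modeled simply as performance per dollar:

```python
# Illustrative only: what a "30 to 40 percent better bang for the buck"
# target implies for Trainium2 instance pricing. All inputs are hypothetical.

h100_price = 100.0    # hypothetical H100 instance price, $/hour
perf_ratio = 2.0      # Trainium2 speculated at ~2X H100 performance (see above)

for uplift in (0.30, 0.40):
    # If performance per dollar improves by `uplift`, the instance price
    # can scale up to perf_ratio / (1 + uplift) times the baseline.
    trn2_price = h100_price * perf_ratio / (1 + uplift)
    print(f"{uplift:.0%} better perf/$ -> ~${trn2_price:.2f}/hour for 2X the performance")
```

Under these assumptions, a Trainium2 instance delivering twice the performance could be priced at roughly 1.4X to 1.5X the equivalent H100 instance and still hit the 30 to 40 percent price/performance advantage.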