Google has just unveiled the TPU v5p, a new performance-tweaked version of its Tensor Processing Unit (TPU) and the company's most powerful, scalable, and flexible AI accelerator so far.
TPUs have long been the basis for training and serving Google's AI-powered products, including YouTube, Gmail, Google Maps, Google Play, and Android. Gemini, Google's just-announced generative AI model, was trained on and is served using Google's TPUs. Alongside the new chip, the company is announcing AI Hypercomputer from Google Cloud, a groundbreaking supercomputer architecture that uses an integrated system of performance-optimized hardware, open software, leading ML frameworks, and flexible consumption models.
Google says its new TPU v5p is capable of 459 teraFLOPS of bfloat16 performance or 918 teraOPS of INT8, backed by a huge 95GB of HBM3 memory with up to 2.76TB/sec of memory bandwidth. Google can link as many as 8960 TPU v5p AI accelerators together in a single pod, using its in-house 600GB/sec inter-chip interconnect to train models faster or at greater precision. That is 35x more chips per pod than TPU v5e allowed, and more than twice as many as TPU v4.
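As a rough back-of-the-envelope check, the per-chip figures above imply the following theoretical peak throughput for a full pod (real-world throughput depends heavily on the workload and interconnect, so treat these as ceilings, not benchmarks):

```python
# Back-of-the-envelope pod math using the per-chip figures quoted above.
# These are theoretical peaks, not measured benchmark numbers.

BF16_TFLOPS_PER_CHIP = 459      # teraFLOPS, bfloat16
INT8_TOPS_PER_CHIP = 918        # teraOPS, INT8
CHIPS_PER_POD = 8960            # max TPU v5p chips in a single pod

pod_bf16_exaflops = BF16_TFLOPS_PER_CHIP * CHIPS_PER_POD / 1_000_000
pod_int8_exaops = INT8_TOPS_PER_CHIP * CHIPS_PER_POD / 1_000_000

print(f"Peak bf16 per pod: {pod_bf16_exaflops:.2f} exaFLOPS")  # ~4.11 exaFLOPS
print(f"Peak INT8 per pod: {pod_int8_exaops:.2f} exaOPS")      # ~8.23 exaOPS
```

In other words, a maxed-out v5p pod lands in the multi-exaFLOPS range on paper, which is what makes the "supercomputer" framing of AI Hypercomputer more than marketing shorthand.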
Google says its new AI accelerator can train popular large language models (LLMs), like OpenAI's 175-billion-parameter GPT-3, 1.9x faster using bf16 and up to 2.8x faster than its older TPU v4 AI accelerator. Each TPU v5p accelerator costs $4.20 per chip-hour to run, a bit more expensive than the TPU v4 at $3.22 per hour or the TPU v5e at $1.20 per hour.
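The higher hourly rate can still work out cheaper per job. A quick hypothetical comparison, assuming the quoted 2.8x speedup over TPU v4 holds for a given workload (the 100 chip-hour job below is an invented example, not a Google figure):

```python
# Hypothetical cost comparison: the v5p costs ~30% more per hour than v4,
# but if it finishes the same job 2.8x faster, the total bill drops.

V5P_PRICE = 4.20                 # USD per chip-hour (quoted above)
V4_PRICE = 3.22
SPEEDUP_V5P_OVER_V4 = 2.8        # quoted training speedup vs TPU v4

v4_hours = 100                   # assumed example job: 100 chip-hours on v4
v4_cost = v4_hours * V4_PRICE
v5p_hours = v4_hours / SPEEDUP_V5P_OVER_V4
v5p_cost = v5p_hours * V5P_PRICE

print(f"TPU v4:  {v4_hours:.1f} h, ${v4_cost:.2f}")    # 100.0 h, $322.00
print(f"TPU v5p: {v5p_hours:.1f} h, ${v5p_cost:.2f}")  # ~35.7 h, ~$150.00
```

Under those assumptions, the v5p run costs less than half as much despite the higher hourly rate, which is presumably how Google expects customers to read the pricing.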
You can read all about Google’s new TPU v5p here.