Google on Thursday (December 7) released a new Tensor Processing Unit, TPU v5p; a supercomputer architecture, AI Hypercomputer; and a resource management tool, Dynamic Workload Scheduler, to help organizations run and manage AI workloads.
Google launched Cloud TPU v5e, which emphasizes cost-effectiveness, in November this year, and this week released Cloud TPU v5p, which it calls its most powerful TPU to date. With 8,960 chips per pod and 4,800 Gbps of inter-chip interconnect bandwidth, Cloud TPU v5p delivers 2x the FLOPS and 3x the high-bandwidth memory (HBM) of the previous-generation TPU v4.
Because Cloud TPU v5p is performance-oriented, it trains large LLM models 2.8x faster than TPU v4, and with the help of second-generation SparseCores it trains embedding-dense models 1.9x faster than TPU v4.
AI Hypercomputer, on the other hand, is a supercomputer architecture that integrates performance-optimized hardware, open software, leading machine learning frameworks, and flexible consumption models. Google explained that demanding AI workloads have traditionally been handled by reinforcing disparate components piecemeal, whereas AI Hypercomputer applies systems-level co-design to improve AI efficiency and productivity across training, fine-tuning, and serving.
In terms of performance-optimized hardware, AI Hypercomputer's compute, storage, and network equipment is built on Google's hyperscale data center infrastructure. It also gives developers access to the hardware for fine-tuning and managing AI workloads through open software, including support for machine learning frameworks such as JAX, TensorFlow, and PyTorch, software such as Multislice Training and Multihost Inferencing, and deep integration with Google Kubernetes Engine (GKE) and Google Compute Engine.
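As a rough illustration of what that framework-level access looks like, the minimal JAX sketch below enumerates the accelerator chips visible to a host and runs a data-parallel computation across them. It is a generic JAX example, not code from Google's announcement, and it runs unchanged on CPU, GPU, or a TPU slice.

```python
import jax
import jax.numpy as jnp

# List the accelerator chips (TPU cores, GPUs, or CPU fallback) this host sees.
devices = jax.local_devices()
print(f"{len(devices)} device(s) on platform: {devices[0].platform}")

# Give each chip one 128x128 operand, then run the same matmul on all of
# them in parallel with pmap, the simplest form of data parallelism.
n = len(devices)
x = jnp.ones((n, 128, 128))

@jax.pmap
def square(a):
    return a @ a

y = square(x)
print(y.shape)  # (n, 128, 128): one result per chip
```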
In addition to Committed Use Discounts (CUDs), on-demand, and Spot pricing, AI Hypercomputer also offers two consumption models designed for AI workloads through the new Dynamic Workload Scheduler: Flex Start and Calendar.
Dynamic Workload Scheduler is a resource management and job scheduling platform that supports Cloud TPUs and NVIDIA GPUs and schedules all of the accelerators a job needs simultaneously, helping users optimize spending. Flex Start mode is aimed at model fine-tuning, experimentation, shorter training jobs, distillation, offline inference, and batch jobs, and is the relatively economical option: the requested GPU and TPU capacity is provisioned as soon as it becomes available.
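To make the "all accelerators scheduled simultaneously" point concrete, here is a purely conceptual Python sketch of gang scheduling; every name in it is hypothetical and it does not reflect any Google Cloud API. The idea is that a distributed job waits in a queue until its full accelerator count can be granted atomically, rather than starting, and paying, on a partial allocation.

```python
import time

def gang_schedule(requested: int, free_accelerators, poll_seconds: float = 5.0):
    """Hypothetical gang scheduler: block until `requested` accelerators are
    free, then claim them all at once, so a distributed training job never
    starts on a partial slice of the hardware it asked for."""
    while free_accelerators() < requested:
        time.sleep(poll_seconds)  # job sits in the queue; nothing runs or bills
    # Atomic grant: every chip is handed to the job at the same moment.
    return [f"accelerator-{i}" for i in range(requested)]

# Example: a job that needs 256 chips starts only when all 256 are available.
grant = gang_schedule(256, free_accelerators=lambda: 256)
print(len(grant), "accelerators granted together")
```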
Calendar mode, on the other hand, reserves a start time for AI workloads, making it suitable for training and experimentation jobs that need a precise start time and a defined duration; it lets users request GPU capacity in fixed-duration blocks of 7 or 14 days, purchasable up to 8 weeks in advance.
Image: Google Cloud