Google’s New AI Chips Split Tasks to Save Power
Mountain View, California, USA
Wednesday, April 22, 2026
Google has released the latest generation of its AI chips, now split into two distinct types: TPU 8t for training models and TPU 8i for inference. This separation aims to optimize performance while cutting energy and water usage across Google’s data centers.
TPU 8t – Training Chips
- Purpose: Handle the intensive task of teaching AI models by adjusting billions of internal parameters.
- Design Focus: High memory bandwidth and fast processing to manage complex training workloads.
TPU 8i – Inference Chips
- Purpose: Execute trained models to answer queries and make predictions.
- Design Focus: A lighter workload, so the chips can be smaller, consume significantly less power, and run on less expensive hardware (see the sketch after this list).
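To make the distinction concrete, here is a minimal sketch in JAX showing why the two workloads differ. The model, sizes, and step functions are hypothetical and only illustrate the general point: a training step must compute gradients and update every parameter, while an inference step is a single forward pass.

```python
import jax
import jax.numpy as jnp

def predict(params, x):
    # Forward pass only: this is the inference-style workload.
    w, b = params
    return jnp.tanh(x @ w + b)

def loss_fn(params, x, y):
    return jnp.mean((predict(params, x) - y) ** 2)

@jax.jit
def train_step(params, x, y, lr=0.1):
    # Training-style workload: gradients flow back through every parameter,
    # which demands far more memory bandwidth and compute.
    grads = jax.grad(loss_fn)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

@jax.jit
def infer_step(params, x):
    # Inference-style workload: no gradients, no parameter updates.
    return predict(params, x)

key = jax.random.PRNGKey(0)
params = (jax.random.normal(key, (8, 4)), jnp.zeros(4))
x = jax.random.normal(key, (32, 8))
y = jnp.zeros((32, 4))

params = train_step(params, x, y)   # heavy: forward + backward + update
preds = infer_step(params, x)       # light: forward pass only
```

In practice this asymmetry is why a chip built only for the second function can be smaller and draw less power than one built for the first.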
Why Separate Chips?
- Efficiency: Matching chip capabilities to specific tasks reduces overall resource consumption.
- Cost Reduction: Smaller, energy‑efficient inference chips lower operating expenses for cloud providers.
- Environmental Impact: Lower electricity and water usage aligns with broader sustainability goals in cloud computing.
Industry Context
- Amazon’s Inferentia: Similar strategy of dedicated inference chips.
- Google's earlier TPU v5e: Targeted smaller workloads; the new generation refines that approach into fully specialized training and inference chips.
Uncertainties
- Pricing: While Google highlights green benefits, it has not committed to passing savings on to customers. The impact on cloud service prices remains unknown.