Google plans nearly two million custom AI chips with Marvell
New memory and inference designs target TPU bottlenecks, Broadcom dependence persists through 2031 deal
Google is in talks with Marvell Technology to design two new specialised chips for its data centres: a memory processing unit meant to run alongside Google’s in-house Tensor Processing Units, and a new TPU focused on inference, according to reporting by The Information summarised by The Decoder. The project would scale quickly: Google is said to be planning nearly two million of the memory processing units, with the design expected to be finalised next year.
The move is part capacity plan and part supplier politics. Google already relies on custom silicon to reduce exposure to Nvidia’s pricing and supply constraints, but it also relies heavily on Broadcom for TPU design work. The Decoder notes that Broadcom charges per-unit fees for each TPU produced, and that Google’s Marvell talks are framed as a way to reduce that dependence—while Broadcom remains embedded through a contract running to 2031.
A two-million-unit run is not a science project; it is a bet that AI workloads will keep growing even as model training becomes more expensive and power-hungry. Splitting tasks between compute-heavy TPUs and memory-oriented accelerators is a practical response to a bottleneck that shows up once models are deployed at scale: serving “finished” models is often limited by moving data rather than doing arithmetic. An inference-specific TPU suggests Google is also preparing for a world where the marginal cost of answering queries—latency, electricity, memory bandwidth—matters as much as the one-time cost of training.
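That bottleneck can be made concrete with a roofline-style estimate. The sketch below uses entirely hypothetical accelerator numbers (not Google, Marvell, or TPU specs) and the common rule of thumb that batch-1 token generation performs roughly two FLOPs per parameter while reading every weight from memory once per step, so data movement, not arithmetic, sets the floor on latency.

```python
# Illustrative roofline-style estimate. All hardware numbers below are
# assumptions for the sake of the example, not any vendor's specs.

def step_time_s(params: float, batch: int,
                peak_flops: float, mem_bw: float) -> tuple[float, bool]:
    """Lower-bound time for one decode step of a dense model.

    Returns (seconds per step, True if memory-bound).
    """
    flops = 2 * params * batch      # ~2 FLOPs per parameter per generated token
    bytes_moved = 2 * params        # fp16 weights read once per step (2 bytes each)
    compute_s = flops / peak_flops
    memory_s = bytes_moved / mem_bw
    return max(compute_s, memory_s), memory_s > compute_s

# Hypothetical accelerator: 500 TFLOP/s peak compute, 2 TB/s memory bandwidth,
# serving a 70B-parameter model one request at a time.
t, memory_bound = step_time_s(params=70e9, batch=1,
                              peak_flops=500e12, mem_bw=2e12)
print(f"{t * 1e3:.1f} ms per token, memory-bound: {memory_bound}")
# → 70.0 ms per token, memory-bound: True
```

Under these assumed numbers the compute side finishes in well under a millisecond while reading the weights takes 70 ms, so the chip idles waiting on memory; only at large batch sizes does arithmetic become the limit. That asymmetry is exactly what a memory-oriented accelerator paired with a compute-heavy TPU would target.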
Marvell’s role also illustrates how the AI hardware stack is consolidating around a small set of designers and foundries. The Decoder points out that Marvell designed an inference chip for Groq, whose technology Nvidia licensed in late 2025 for $20 billion; Nvidia later unveiled Groq-based rack systems at GTC 2026. The industry’s “alternatives” to Nvidia often end up feeding back into the same ecosystem: the most successful designs get bought, licensed, or copied into the dominant vendor’s product line.
For Google, custom chips are not just about speed. Owning the roadmap means controlling which workloads are cheap enough to offer as products—and which remain too costly to serve. If the plan holds, Google’s next AI scaling step may be measured less in new models than in how many inference requests its data centres can afford to answer per second.