Dylan Patel on Google's TPU going external: how it's already saving OpenAI 30% on Nvidia spend before deploying a single chip
Dec 1, 2025 with Dylan Patel
Key Points
- OpenAI is receiving roughly 30% off Nvidia GPU pricing as a consequence of Nvidia's equity investment in the lab, a structural advantage Meta cannot replicate as a public company and a primary reason Meta is seriously evaluating Google's TPU v7.
- Google's external TPU market is limited to roughly 10 sophisticated customers, and the company faces a software ecosystem gap: roughly 40% of CUDA-adjacent open-source contributions come from Chinese entities such as ByteDance, while Google lacks comparable third-party momentum.
- TPU deployment requires purpose-built infrastructure, including custom racks roughly three times standard width and proprietary liquid cooling, forcing potential customers such as neoclouds to build new facilities rather than retrofit existing data centers.
Summary
Google's push to sell TPUs externally is already reshaping procurement economics before a single external chip ships at scale. OpenAI is receiving roughly 30% off Nvidia GPU pricing as a direct consequence of Nvidia's equity investment in the lab, a deal structure that Meta cannot replicate given its size and public-company constraints. That asymmetry is one reason Meta is evaluating TPUs seriously. The evaluation is not a negotiating tactic: TPU v7 is currently assessed as superior on performance per watt and performance per dollar, and Meta is acutely power-constrained.
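To see why a discount of that size matters, here is a toy performance-per-dollar comparison. Every number below is a hypothetical placeholder rather than a figure from the discussion; the only point is that a roughly 30% price cut can flip which chip looks cheaper per unit of performance.

```python
# Toy perf-per-dollar comparison. All numbers are hypothetical placeholders,
# not figures from the discussion or from SemiAnalysis.

def perf_per_dollar(perf, price):
    return perf / price

gpu_perf, gpu_list_price = 1.00, 1.00   # normalized Nvidia GPU baseline
tpu_perf, tpu_price      = 0.95, 0.85   # hypothetical: slightly slower, priced below GPU list

print(perf_per_dollar(gpu_perf, gpu_list_price))        # 1.00  -> GPU at list price
print(perf_per_dollar(tpu_perf, tpu_price))             # ~1.12 -> TPU looks better
print(perf_per_dollar(gpu_perf, gpu_list_price * 0.7))  # ~1.43 -> GPU at 30% off wins again
```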
Dylan Patel of SemiAnalysis frames TPU's addressable external market as roughly 10 sophisticated customers, a group that includes the major AI labs and select hyperscalers. Anthropic is already an early buyer, aided by the volume of ex-Google engineers it has hired who understand the stack. Others, including neoclouds considering TPU deployments, are relying on SemiAnalysis's TCO models to run "should-cost" analyses before entering negotiations with Google Cloud.
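A should-cost analysis of this kind boils down to rebuilding the seller's cost structure from the bottom up and deciding what a fair price would be before negotiating. The sketch below is a minimal illustration under stated assumptions; the line items echo the ones discussed later (chips, racks, cooling, power), and every input is a hypothetical placeholder rather than a SemiAnalysis figure.

```python
# Minimal "should-cost" TCO sketch. Every input is a hypothetical placeholder,
# not a SemiAnalysis number; the goal is only to show the shape of the model.

HOURS_PER_YEAR = 8760

def cost_per_chip_hour(chip_capex, rack_cooling_capex, power_kw,
                       power_cost_per_kwh, depreciation_years, utilization):
    """Rough all-in cost per productive chip-hour under the stated assumptions."""
    yearly_capex = (chip_capex + rack_cooling_capex) / depreciation_years
    yearly_power = power_kw * power_cost_per_kwh * HOURS_PER_YEAR
    productive_hours = HOURS_PER_YEAR * utilization
    return (yearly_capex + yearly_power) / productive_hours

# Hypothetical inputs for a negotiation "should-cost" range:
print(cost_per_chip_hour(chip_capex=20_000, rack_cooling_capex=5_000,
                         power_kw=1.0, power_cost_per_kwh=0.08,
                         depreciation_years=4, utilization=0.6))
```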
Software Is the Harder Problem
Google's software ecosystem sits in three buckets: proprietary tools that remain closed but are accessible via Google Cloud, software being aggressively open-sourced, and internal tools that will never be released. SemiAnalysis tracked commit volume across major open-source AI libraries, including PyTorch and vLLM, and found that Google's TPU-related commits have exploded in recent months, a deliberate strategy shift tied to the external sales push.
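Commit tracking of this sort can be approximated with nothing more than git history. The sketch below is a crude stand-in, not SemiAnalysis's methodology: it assumes a local clone of a repository such as vLLM and treats keywords in commit messages (tpu, xla, pjrt) as a proxy for TPU-related work.

```python
# Crude sketch of counting TPU-related commits per month in a local clone.
# Keyword matching on commit subjects is only an approximation of the kind
# of commit tracking described above.
import subprocess
from collections import Counter

def tpu_commits_by_month(repo_path, keywords=("tpu", "xla", "pjrt")):
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--since=2024-01-01",
         "--pretty=format:%ad|%s", "--date=format:%Y-%m"],
        capture_output=True, text=True, check=True,
    ).stdout
    counts = Counter()
    for line in log.splitlines():
        month, _, subject = line.partition("|")
        if any(k in subject.lower() for k in keywords):
            counts[month] += 1
    return dict(sorted(counts.items()))

# Example: print(tpu_commits_by_month("./vllm"))
```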
The comparison to Nvidia is instructive. Roughly 40% of CUDA-adjacent open-source contributions come from Chinese entities, including ByteDance and DeepSeek. Google lacks that kind of third-party ecosystem momentum. Anthropic will not open-source software, and Google's internal Gemini team and its Google Cloud TPU sales team have divergent incentives around what to release, a structural tension that has no clean resolution.
On inference performance, the honest answer is that no one knows yet. Using vLLM today, TPUs deliver worse performance per TCO dollar than Nvidia GPUs. SemiAnalysis is working to add TPU support to its InferenceMAX benchmark, with an internal target of completing that work by the end of the year. The cost side of the TCO model, built from ground-up supply chain modeling covering chips, racks, liquid cooling, memory, and cabling, is assessed with reasonable confidence. Performance at realistic utilization rates remains a wide range.
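Because the cost denominator is comparatively well pinned down, most of the spread in any eventual performance-per-TCO-dollar number comes from the utilization term, as the small sketch below illustrates with hypothetical figures.

```python
# Performance per TCO dollar as a function of utilization. Hypothetical
# placeholders only; the $/hour figure stands in for a bottom-up TCO estimate.

def perf_per_tco_dollar(peak_tokens_per_sec, utilization, tco_per_hour):
    """Tokens actually delivered per dollar of hourly TCO."""
    delivered_tokens_per_hour = peak_tokens_per_sec * 3600 * utilization
    return delivered_tokens_per_hour / tco_per_hour

tco_per_hour = 2.00                  # hypothetical $/chip-hour from a cost model
for utilization in (0.3, 0.5, 0.7):  # the "wide range" is dominated by this term
    print(utilization, round(perf_per_tco_dollar(1000, utilization, tco_per_hour)))
```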
Physical Infrastructure Is a Real Barrier
TPU deployment is not a drop-in replacement for Nvidia hardware. Google's racks are roughly three times the standard width, the liquid cooling supply chain is vertically integrated and sourced from vendors that have only ever sold to Google, and some data centers may not physically accommodate the hardware without structural modifications. Neoclouds evaluating a TPU strategy may need to build purpose-built facilities rather than retrofit existing ones.
The Broadcom Dependency and the MediaTek Bet
Google has historically outsourced the majority of TPU implementation to Broadcom, which handles networking, gate layout, and supply chain negotiations. Broadcom remains the dominant networking silicon vendor globally, and Google cannot fully exit that relationship until it controls those competencies internally or through an alternative partner.
For TPU v8, Google is running a parallel development track with MediaTek. MediaTek takes significantly lower margins and does not mark up memory pass-through costs the way Broadcom does, which could meaningfully improve Google's chip economics. The risk is execution. MediaTek is a credible but clearly inferior networking partner compared to Broadcom, and the diversion of engineering resources to make that relationship work is a factor in why TPU v8 faces execution risk at a moment when Nvidia's Rubin architecture is being developed at full intensity. Patel's current assessment is that Rubin will be substantially better than TPU v8, though TPU delays or Rubin slippage could shift that calculus.
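To make the margin point concrete, a toy cost build-up is sketched below. All inputs are hypothetical placeholders, not figures from the discussion; the shape of the calculation is the point: partner margin and any markup on memory pass-through both compound into the final chip cost.

```python
# Toy accelerator cost build-up showing why partner margin and memory
# pass-through markup matter. All inputs are hypothetical placeholders.

def chip_cost(silicon_cost, memory_cost, memory_markup, partner_margin):
    bill_of_materials = silicon_cost + memory_cost * (1 + memory_markup)
    return bill_of_materials * (1 + partner_margin)

with_markup    = chip_cost(silicon_cost=3_000, memory_cost=2_500,
                           memory_markup=0.25, partner_margin=0.60)
without_markup = chip_cost(silicon_cost=3_000, memory_cost=2_500,
                           memory_markup=0.00, partner_margin=0.30)
print(with_markup, without_markup, f"{1 - without_markup / with_markup:.0%} cheaper")
```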
Nvidia's Synopsys Investment
Nvidia's $2 billion investment in Synopsys, announced the same day, is a strategic move to make GPUs the first-class compute platform for EDA workloads. Much of chip design tooling today runs on CPUs and FPGAs. As AI-assisted chip design scales, with Patel counting more than 20 companies now active in that space, Nvidia is positioning to capture that workload. The three dominant EDA vendors control roughly 95% of revenue in the category. Synopsys's valuation is at a multi-year low on an earnings-multiple basis, making the timing favorable for Nvidia, which is deploying capital that would otherwise have gone to dividends or buybacks. SemiAnalysis has a forthcoming long-form piece mapping the AI chip design landscape.