Nebius posts 841% year-over-year AI cloud revenue growth, acquires two inference optimization teams to accelerate GPU efficiency
Key Points
- Nebius posts 841% year-over-year AI cloud revenue growth by positioning itself as a vertically integrated full-stack platform rather than a bare-metal hardware rental operation.
- Nebius acquires Aigen AI and Clarify, inference optimization teams focused on model-level and system-level efficiency respectively, to move faster in a market where competitive positioning shifts every three months.
- Customers typically migrate from frontier models to open-source alternatives within three to six months when cost performance improves, and Nebius aims to lower that switching barrier through better inference efficiency.
Summary
Nebius posted 841% year-over-year AI cloud revenue growth, a number that reflects both the explosion in AI infrastructure demand and the company's push to own more of the stack. Roman Chernin, a founder who has been with Nebius since the company took its current shape in summer 2024, describes the business as a vertically integrated, full-stack AI cloud — deliberately distinct from bare-metal providers that rent out hardware on multi-year contracts with months-long lead times.
The Yandex heritage matters. Chernin notes the founding team brought deep technical capability that let Nebius move fast from day one. ClickHouse came out of the same talent pool.
“The AI Cloud division posted 841% year over year revenue growth. We've got new chips every three months, new physical data centers every month, new customers at a scale we've never seen before every month — and the workloads are changing so fast. We recently announced two acquisitions of teams focused on inference optimization: one on model optimization and one on system optimization.”
Two acquisitions, one thesis
Nebius recently acquired two inference optimization teams: Aigen AI, focused on model-level techniques including speculative decoding and quantization, and Clarify, which works on system-level optimization — inference routing, KV caching, and orchestration across large compute clusters. Chernin says Nebius already had a strong internal inference team but needed to move faster. In a market where competitive positioning can shift materially every three months, falling behind on inference efficiency shows up directly in margins and in the range of customers and workloads the platform can serve.
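To make the model-level side concrete, here is a minimal sketch of greedy speculative decoding, the technique attributed to the Aigen AI team above. It is illustrative only, not Nebius's implementation: the toy next-token functions, the draft length k=4, and all names are assumptions, and the target's verification is written as a loop for clarity where a real system would batch it into one forward pass.

```python
# Minimal sketch of greedy speculative decoding over integer token ids.
# Illustrative only: toy models, not a production inference stack.
from typing import Callable, List

NextToken = Callable[[List[int]], int]  # maps a context to the next token id

def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], max_new: int, k: int = 4) -> List[int]:
    """Generate max_new tokens: a cheap draft model proposes up to k tokens
    per step, and the expensive target model verifies them."""
    tokens = list(prompt)
    generated = 0
    while generated < max_new:
        # 1. Draft model proposes up to k tokens autoregressively (cheap).
        proposal, ctx = [], list(tokens)
        for _ in range(min(k, max_new - generated)):
            t = draft(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2. Target model checks each proposed position (batched in practice).
        accepted = []
        for t in proposal:
            expected = target(tokens + accepted)
            if expected == t:
                accepted.append(t)          # draft guessed right: keep it
            else:
                accepted.append(expected)   # first mismatch: take the target's token, stop
                break
        tokens.extend(accepted)
        generated += len(accepted)
    return tokens

# Toy models: the target predicts (last token + 1) % 100; the draft agrees
# except after every multiple of 7, so most proposals are accepted.
target_model: NextToken = lambda ctx: (ctx[-1] + 1) % 100
draft_model: NextToken = (lambda ctx:
    (ctx[-1] + 2) % 100 if ctx[-1] % 7 == 0 else (ctx[-1] + 1) % 100)

print(speculative_decode(target_model, draft_model, prompt=[0], max_new=12))
```

The payoff in a real deployment is that several cheap draft tokens can be verified by the large model in a single pass, cutting latency and GPU time per generated token, which is exactly the kind of margin lever the acquisitions target.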
The acquisition logic extends beyond cost. Better inference efficiency unlocks fast-growing vertical AI companies and enterprise customers who need more than raw compute — they need reliability, tooling, and a platform that can help them tune and run open-source models at scale.
The customer arc
Chernin's view of how enterprise AI workloads evolve is specific. Customers typically start with frontier models to validate a use case, then face an economics problem at scale. When open-source alternatives can deliver comparable performance for a fraction of the cost — Chernin puts the lag between frontier and open-source at roughly three to six months for most tasks — customers switch. That switch isn't trivial; it requires fine-tuning, data work, and reinforcement learning. Nebius sees lowering that barrier as a core part of its value proposition.
Constraints
The binding constraints right now are physical infrastructure and capital, not chips alone. Getting from greenfield to a functioning, GPU-filled data center — including grid connections, local power generation, and staffing — is the hard problem. Chernin argues that with unlimited capital, Nebius could move even faster on both technical and operational execution. Financing the build without destroying margins is itself part of the discipline.
Post-sales execution, he says, is the unsung engine of the business. New chip generations arrive every three months, new data centers go live every month, and customer workloads change constantly. The delivery chain — from supply to customer-facing engineering support — is where cloud businesses are actually won or lost.