Baseten raises $150M to power AI inference for companies running custom fine-tuned models
Sep 5, 2025 with Tuhin Srivastava
Key Points
- Baseten raises $150 million to serve customers running custom fine-tuned models, positioning itself as the infrastructure layer between applications and GPU capacity for companies like Notion, Gamma, and Bland.
- The company's defensibility rests on model fragmentation: customers train proprietary variants tailored to their use cases, making the differentiation sit in the application layer rather than commodity compute routing.
- Srivastava expects falling token prices to drive higher inference volumes overall, invoking Jevons paradox to argue that cheaper inference cycles back into more spending within months.
Summary
Baseten, a six-year-old AI inference infrastructure company, has raised $150 million. CEO Tuhin Srivastava describes the business as the layer between AI applications and GPU capacity — acquiring compute, optimizing models it didn't train, and scaling them gracefully as user demand spikes. Customers include Bland, Gamma, Clay, Notion, and Open Evidence.
The moat argument turns on model fragmentation. The standard worry about inference infrastructure — that OpenRouter-style commodity routing will compress margins as token prices fall — assumes everyone runs the same models. Srivastava says most Baseten customers run fine-tuned variants tailored to their use case. Open Evidence is his clearest example: the company trains its own models to answer clinical queries from doctors, runs them at scale, and does it with a two-person infrastructure team by outsourcing the compute layer to Baseten. The differentiation sits in the application, not the plumbing, and that logic is what keeps customers from building the infrastructure themselves.
On token pricing, Srivastava is relaxed. He acknowledges inference will get cheaper and invokes Jevons paradox — noting that every time Baseten lowers prices or optimizes a customer's models, that customer is spending more again within four months. He expects the same dynamic to hold industry-wide: cheaper inference drives more inference, and the total market grows.
Headcount is around 40, up from roughly 30 a year ago. The $150 million goes toward two things: building out a go-to-market team and hiring the expensive engineers required to stay competitive on model optimization and scaling. Srivastava frames this as a land-grab moment — the market arrived faster than the team anticipated, and the capital is about moving as quickly as possible to capture it.