Baseten raises $150M to power AI inference for companies running custom fine-tuned models

Sep 5, 2025 · Full transcript · This transcript is auto-generated and may contain errors.

Featuring Tuhin Srivastava

beds, top-tier cleaning, and 24/7 concierge service. It's a vacation home, but better, folks. But better. It's better. It's Baseten time. Uh, do we have our next guest? Welcome to the stream. How are you doing? Welcome. How we doing? We're good. What's up? The eagle has landed. The eagle has landed.

Welcome to the show. Kick us off with an introduction on yourself, the company, and the news. Yeah, thanks for having me, guys. I'm Tuhin. I'm the CEO of Baseten. You know, I'm pretty excited today. We're an AI infrastructure company, and we just raised 150 million bucks. That's kind of loud.

All right. That is loud. That is loud. $150 million. Congratulations. Incredible. And it's great to meet you. Announcing a fundraise on a Friday is not for the weak, but you guys are wasting no time. I love it. Can you give us a quick history on the company and your background?

Yeah, look, my background is in machine learning and AI. I've been doing it for about 15 years. The company's actually six years old. You know, we didn't just succeed overnight. The whole time, we've been thinking about how to turn AI into value. Yeah.

I think the last two years have been a bit crazy. We started focusing very aggressively on inference maybe 24 months ago. Are you guys familiar with what inference is? Yeah. But I'm not familiar with where you sit in the stack. You don't build data centers. You don't own GPU clusters.

You sit on top of neoclouds or on top of hyperscalers, and then you sell to companies. I see Descript, Retool, Corora, Writer, Patreon, Picnic Health. I can imagine how those companies need an API for a model that you probably didn't train.

But you act as the inference provider for that, and you sell tokens to them. Like, how is it going at that layer of the stack? How does the value accrual work in that particular layer? It's fantastic, right? Because, look, actually doing inference for models is pretty hard, right?

So you've got to acquire capacity, you've got to optimize models someone else has trained, you've got to run them, and then you need to scale them pretty gracefully as their apps scale with users. And so the amount of work that goes into that from teams is massive, and we're just trying to take that headache away. Yeah.
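The "scale gracefully" step can be sketched as a simple capacity calculation. This is a minimal, hypothetical illustration, not Baseten's actual autoscaling logic: assume each GPU replica sustains a fixed request throughput, and provision enough replicas to cover current load plus some headroom (all numbers invented).

```python
import math

def replicas_needed(requests_per_sec: float,
                    per_replica_rps: float,
                    headroom: float = 0.2,
                    min_replicas: int = 1) -> int:
    """Toy autoscaling rule: provision enough replicas to serve the
    current load plus a safety headroom, never dropping below a floor."""
    target = requests_per_sec * (1 + headroom)
    return max(min_replicas, math.ceil(target / per_replica_rps))

# As an app's traffic grows, the replica count scales with it.
print(replicas_needed(50, per_replica_rps=8))   # 50 rps * 1.2 / 8 -> 8 replicas
print(replicas_needed(400, per_replica_rps=8))  # 400 rps * 1.2 / 8 -> 60 replicas
```

Real systems add smoothing, cold-start handling, and scale-down delays on top of a rule like this, which is part of why teams hand the problem to an inference provider.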

And so we're pretty lucky that, you know, we've been able to just sit on top of this AI application layer, and as that's exploded, we've just kind of ridden the wave with it. And we work with a bunch of amazing customers like Abridge, Bland, Gamma, Clay, OpenEvidence, Notion.

What is the durable moat? Like, how does value accrue over the long term? Because I feel like, you know, if you don't own the model and can't rent-seek on that, and you don't own the hardware, all of a sudden OpenRouter is going to route people to wherever the cheapest tokens are, and it becomes highly competitive and the margins might compress. What's the long-term strategy? The assumption you're making there is that all the models are just going to be the same and everyone's going to be running kind of the same one.

So most of our customers run different variants of the model that are very custom to them. Okay. And so, fine-tuned on open source. Yeah. So how about, you said Notion is a customer, maybe we use them as an example? Yeah, that's great. Or, you know, pick one. Yeah.

Have you guys heard of OpenEvidence? Yeah. I think we're having the founder on next week. So they're the best. You'll have a really good time chatting with them. You know, they're a really good example of a customer that uses us.

They train a bunch of models of their own to basically answer the questions, or the queries, from doctors.

And so, you know, what happens when you need to run those models, or you need to make them wicked fast? You need to go get a bunch of GPUs from a bunch of different places, and then you need to kind of scale gracefully across that. Sure. And so, how do you do that?

Well, you either go and use an inference provider like Baseten, or you go and build it yourself. And I think what the smart companies are realizing right now is that this is not particularly differentiated for them, so they shouldn't do the infrastructure piece.

They should focus on what is differentiated for them, which is how they use the models and the applications they build, and, you know, kind of give the boring piece to us.

And so OpenEvidence is used by some ungodly number of doctors every day, and they do it with two great infra guys, Jag and Micah. It's pretty amazing the scale they can achieve using software like Baseten. What does the next 12 months look like for the company?

Yeah, look, this is just such a land grab of a market. We've been doing this for a while, and, you know, we've built some great technology, but we can't overstate how much a market just showed up for us. And so, how can you go as fast as possible?

It's actually kind of what it feels like venture was built for. You know, that's why we're raising capital right now: to solve two core problems. One, hire a go-to-market team that's going to be killer and scale that.

And then, two, hire amazing engineers, who are incredibly expensive and hard to come by. Yeah. What, I don't know, we could go deeper. Last question for me: what is your view on the broad trend in token pricing?

There have been people going back and forth on whether or not we're going to get 10x savings in inference pricing quickly.

Whether it's going to be savings on a per-token basis, but then just using more. OpenAI just put a $10 billion order into Broadcom, and I'm wondering, but then at the same time, companies are just saying: I want to stay on the frontier, I'll use a reasoning model, I will continue to eat the high per-token costs that come out of the bigger models. Yeah, look, I think the token price goes down, and inference should get cheaper over time. And I think that really just means there's got to be more inference.

You know, we all know Jevons paradox. All of us in tech discovered Jevons paradox six months ago. But every time we lower prices for our customers, or we go and optimize their models to make it cheaper, four months later they're spending more anyway. Wow.
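The Jevons-paradox dynamic he describes can be put in numbers. A toy illustration (all figures invented, not Baseten pricing): if the per-token price falls 5x but usage grows 8x, the customer's bill still goes up.

```python
def monthly_spend(tokens: float, price_per_million: float) -> float:
    """Spend in dollars for a given token volume at a $/million-token rate."""
    return tokens / 1e6 * price_per_million

before = monthly_spend(2e9, 10.0)   # 2B tokens at $10/M tokens -> $20,000
after  = monthly_spend(16e9, 2.0)   # 16B tokens at $2/M tokens -> $32,000
print(before, after)  # spend rises despite a 5x price cut
```

Whenever demand elasticity outpaces the price decline, cheaper inference means more total inference spend, which is the bet being described.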

And so, like, yeah, inference prices are going to go down, but if the world is run by AI 10 years from now, there's going to be a lot of inference. Yeah, so it better be cheap.

And, you know, we just hope that we can power all that and be the invisible layer underneath it, and hopefully scoop off a bit of value while we're at it. What's head count today? Just out of curiosity. 104, I think. 104. Let's call it that. That's a good team. That's good.

Yeah, we were about 30, I want to say, a year ago. So it's growing pretty fast. Yeah. How many pizzas is that? 100 people? 50 pizzas. Depends how much people eat, right? 50 pizzas. Yeah. Anyway, if it's bulking season, you know, it's different. Might be 100 pizzas. Yeah.

Well, anyway, thank you so much for hopping on. Great having you.