Google's Logan Kilpatrick on Gemini 3 Flash: faster, cheaper, and better than Gemini 3 Pro in key benchmarks
Dec 17, 2025 · Full transcript · This transcript is auto-generated and may contain errors.
Featuring Logan Kilpatrick
to have
he's that guy
leadership in the position, hopefully doing some cool stuff.
You know who else is that guy?
Logan Kilpatrick from Google. Of course,
he was in the waiting room, and now he's in the Ultradome. Welcome to the show, Logan. How are you doing?
He's that guy, pal. Hey guys, happy holidays.
Happy holidays. I love the background.
The "UAV online" is incredible. [laughter]
Well, we have Christmas-themed ones now. So, there's the drummer boy one.
Hostile drummer boy on your six.
Hostile drummer boy on your six. That's a good one. [laughter] The team had fun with that. Anyway, thank you so much for joining. Everyone knows who you are around here, but take us through the news today: Gemini 3 Flash. What are the highlights, and how are you positioning the communication around that particular launch?
Yeah, it's a great question, and thank you again for having me on. Gemini 3 Flash is, I think, the most capable model along a bunch of dimensions that we've ever shipped, and it's also a model that is going to bring intelligence to everyone, which is the exciting thing. So I love it. [laughter] I feel like what we need to do for Gemini launches is get a gong. I'm going to get a gong for the office for when we launch.
We'll help you out. We know all the suppliers, and we get bulk pricing, you know, we'll help you guys out. An etched gong for the office for when you launch would be perfect. No, but I think the model is incredible, and I think it builds on the momentum from 3 Pro, which is really exciting. And the crazy thing is it's actually better than 3 Pro along some dimensions.
I was looking at the model card. It outperforms on ARC-AGI-2, which is my personal favorite benchmark (we love the team over at ARC-AGI), and somehow the Flash model outperforms the Pro model. I would be very interested in what the thesis on that is. Was Gemini 3 Pro overthinking it? Is that what's going on?
We also did our own benchmark, Shrimp Bench.
Oh yeah, Shrimp Bench went very well.
We had it come up with, uh,
you're supposed to think of jokes that have the same format as "you're telling me a shrimp fried this rice." So, shrimp...
For "ginger snap" it said, "You're telling me a ginger snap this cookie." [laughter]
It's pretty good. Pretty good. It's funny. We're loving it.
Some of the models that are quite smart absolutely fail it. So, it's a good benchmark, sir.
Yeah. Really quick though, there's a very pointed answer on why Flash is so much better. And actually, it's not that it's so much better.
It's better in certain places.
In some places it's a little bit better. And this is because the model is a little bit smaller, so we're able to do a little bit more iteration, and it has a slightly updated post-training recipe versus the Pro model, just because of the timing. We finished Pro and finished some of that training probably a little bit more than a month ago, so just within the last month we've made a bunch of innovation on the research side, and it makes it into the smaller model because of the timing and the release process.
So you'll see some of that. There's no special silver bullet behind the scenes making this possible. It's literally just time, and with enough time we can keep making even better and better models, which is exciting.
Yeah. I feel like the Gemini team is very good about offering different models along the Pareto frontier: not overfixating on a particular benchmark, but delivering a high-quality product at each incremental price, the best product for a given price. Do you envision the left and right bounds expanding further? Is there something like a nano coming, something even cheaper, even smaller, even more compressed? How do you think about offering a product from the same API structure but at an even cheaper price? Is that something that's coming?
Yeah, so we're working hard on this. The 3 model family, or historically the 2.5 model family, has been Pro, Flash, and Flash-Lite, which is the smallest model, and then Nano (I don't know if we still call it Nano or not) is our on-device model, which is the smallest of all. So we will hopefully have a Flash-Lite model early next year. It takes time to trickle the innovation through, and the holidays are busy.
I was joking with Jordi, because the Nano Banana code name sort of leaked into the public consciousness just because it was exciting as a great model, and so people figured it out. Nano Banana. But it implies the existence of a normal Banana, or a Banana Pro, or a Mega Banana, or a Giga Banana. And so we want the Giga Banana. We want the full image model, the biggest possible model. But of course,
Banana Pro is your Giga Banana. We actually thought about calling it Giga Banana.
It wasn't as catchy as Nano Banana.
Nano Banana already has its own brand, so you kind of have to just run with that. You've got the bird in the hand. Maybe getting a little bit more specific: how have conversations been going with startups that are leveraging the suite of models? Where are people most excited in terms of actually implementing them? Yeah, I think the cool thing, and this hasn't been unique to this launch, is that 2.5 Pro customers were paying, I think, between $1.25 and $2.50 per million input tokens and then like $10 to $12 per million output tokens. And 3 Flash is actually better than 2.5 Pro almost completely across the board. So if you're a startup for whom 2.5 Pro was the model, and that means literally tens to hundreds of thousands of customers in production right now, this is the case for you.
They're able to migrate over to this new model and, out of the box, get cost savings and their product just gets faster. I was talking to someone earlier today about the fact that two years ago this narrative of being a model wrapper was seen as a bad idea. And now, think about this for our team: we have APIs for developers to build with, but we're also building a vibe-coding product in AI Studio. For the vibe-coding product, we ran a silent test behind the scenes with 3 Flash, because of how much faster it was, and it was reasonably the same quality as 3 Pro. Retention went up, the number of things people were building went up, engagement went up, and it was free. We didn't have to do anything; we literally just changed the model in the dropdown, and it's actually cheaper for us to run. As a product builder, I don't think there's been a time in the history of building startups where you just get to show up one morning and somebody made your product 40% better and saved you 50%.
The analogy is, usually if you make something 40% better, you'd better get ready to pay 50% more, you know?
Exactly. Exactly. So it's crazy. That continues to get me, in the fiber of my being, so excited about what startups are able to go after. It's a great moment, so hopefully folks get a chance to adopt. And the next thing I'm excited about is that these two models, Flash and Pro, now become the default models to build on top of for our other custom variants. Image generation is one example, but we also have text-to-speech models, live audio models, computer use models, and a bunch of other stuff in the works, and they all rebase onto this new thing and then, again, by default just get better out of the box, which is really cool. So you'll see not just the coding, agentic, and multimodal capabilities that we've shown with this specific 3 Flash based model; you'll see this across pretty much every other dimension as all of those more bespoke checkpoints rebase onto these new 3 Flash and 3 Pro variants.
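[Editor's note] As a rough illustration of the cost math behind the kind of drop-in migration described here, consider the sketch below: the application code stays the same and only the model string changes. The 2.5 Pro rates echo the per-million-token figures mentioned in the conversation; the Flash-tier rates and the model names are illustrative assumptions, not official pricing.

```python
# Hypothetical per-million-token rates: (input $/1M, output $/1M).
# 2.5 Pro numbers follow the figures quoted in the conversation;
# the 3 Flash numbers are assumed purely for illustration.
PRICING = {
    "gemini-2.5-pro": (2.50, 12.00),
    "gemini-3-flash": (0.50, 3.00),  # assumed, not official
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a workload under the hypothetical rates above."""
    in_rate, out_rate = PRICING[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A month of traffic: 500M input tokens, 50M output tokens.
old = request_cost("gemini-2.5-pro", 500_000_000, 50_000_000)
new = request_cost("gemini-3-flash", 500_000_000, 50_000_000)
print(f"2.5 Pro: ${old:,.0f}/mo  3 Flash: ${new:,.0f}/mo")
# -> 2.5 Pro: $1,850/mo  3 Flash: $400/mo
```

Under these assumed rates, the only change a caller makes is the model identifier, which is the "just changed the model in the dropdown" point.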
What's going on with the agent builder that you guys rolled out, was it last week? I'm losing track of time.
Yeah. I know it's outside of your key territory, but I'd love to hear what's exciting about it to you.
Yeah. I'll give you my very candid take, which is that I was trying a version of that product for probably, and I tweeted this out, like three months before it came out. The reason I loved it so much is that inside of Google we can't just use random startup products; there's a very rigorous process around what software we actually use as Google employees. So this homegrown solution, where I get to have an agent builder and connect all my stuff up and build workflows, really is the AI product that I get to use on a daily basis. So it's exceptionally important that it's world-class, that it pushes the frontier, and that it's actually good, because I don't get the free market of options. I can't just go and choose between any one of a hundred AI startups providing something similar. For what it's worth, I get those in my personal life, which is awesome, but I don't get them in my work life. I feel like it's hit the mark of where I wanted it to be; I really do get genuine product utility. And as somebody whose email is out on the internet (you can email me, lkilpatrick@google.com), I get many, many emails. It is helpful to have a system. I still respond to all of them manually, but it is nice to have some filtering mechanisms, to separate my external stuff from my internal stuff, and to hook it up to the rest of my Google information. It's awesome. So I'm a huge fan, and it's also just chapter one of this AI-native workspace story; I think you'll see this across other Google products and services. I don't know if you all have had Robby Stein on, who leads AI product for Search.
Um,
oh yeah.
Yeah, we did. I think we did. That's right.
Robby's the man. I love Robby. We talked probably six months ago, and he was telling me this story, at the time, of how Search was becoming this frontier AI product, and I sort of had to suspend disbelief a little bit. And I think now you see it: AI Mode is crushing it. It really is just as competitive a frontier AI product as a lot of the others out in the market, and it's at Search scale, which is crazy. So they're doing a great job, and I think Workspace is going to go through that same arc, where it becomes a frontier AI product, which is great for the billion-plus users who are using Workspace.
Anything else shipping this year, or are you going to give everyone a break? Let people, uh,
let the other labs rest [laughter]
We're offline starting next Monday. We've got a couple more things in the pipeline the rest of this week, which I'm excited about, and then people really will be offline. It's the blissful last part of the year, where people actually take time off and relax. And then the roadmap for January is going to be ridiculous.
You should try and get a massive company-wide Secret Santa going, [laughter] with like thousands of people.
One more quick pitch for both of you: I've been pushing for the blimp. So, don't worry, I'm having the conversation.
We've been pushing for the blimp.
It's time to blimp. It's time to blimp.
Yeah.
There's maybe going to be a blimp coming. It's in here.
It's in the Google DNA, I feel.
I'm not gonna say who's involved, but we've got some people involved.
I [laughter] love this.
See what happens.
We'd be...
2026: TBPN, live from the air.
Christmas gift for us.
Yeah,
that'd be incredible.
What a year. You and the team should be incredibly proud, and we've loved every conversation. So, thank you for being a part of this.
And yeah, I'm glad we'll be off when you guys are off. Although I expect someone out there to be like, "We're launching something on Christmas Eve," because everybody else is off.
Imagine.
Anyways,
but we'll cover it when we're back in January.
Awesome.
But thank you so much for taking the time to come on the show.
Have a great rest of the year.
Cheers.
Happy holidays. Goodbye. One last ad: if you don't have access to a blimp and you want to run a billboard ad, go to adquick.com. Out-of-home advertising made easy and measurable. Plan, buy, and measure out-of-home ads.
We need some David Senra billboards.
Yeah. When's the billboard?
We need the $1 million ad buy.