Google's Logan Kilpatrick on Gemini 2.5 Pro, the four converging AI curves, and why builders have never had it better

Apr 22, 2025 · Full transcript · This transcript is auto-generated and may contain errors.

Featuring Logan Kilpatrick

the show now. Let's bring Logan into the studio. How you doing, Logan? Let's see. We're bringing him in. How you doing? I'm doing great. It's been a great, busy last six months of AI stuff, so I'm trying to stay alive. Just the last six months? I feel like every six days is huge in the AI world.

It really is like the best place to do content around, or just read about, or listen to podcasts about. The AI world is just so fertile, regardless of what you think about p(doom) and acceleration and all that. Just in terms of the applications, the deals that are getting done, it's fascinating.

Uh, so yeah, what are you watching today? What are you watching this week? What's most interesting? Yeah, that's a good question. I think continued momentum of 2.5 Pro on the Gemini side. I think obviously a bunch of new OpenAI models, which has been awesome to see.

It also goes back to your point about how much fertility there is in different AI stuff.

I think if you look two years ago, it was really just the models, and the cadence of model launches was actually quite slow relative to products. Now we have all these products which people are really excited about, and the product innovation actually happens a lot faster.

So it's acceleration across the model category, but also across the whole product category, and there's just too much to keep up with at this point. It's impossible.

Are you generally feeling the acceleration? Because I feel like we are seeing acceleration on the product side, and certainly in the fragmentation of models and the specialization of these models.

But in terms of just massive order-of-magnitude breakthroughs, I feel like when we went from GPT-3 to GPT-4, that was a huge leap. We passed the Turing test. ChatGPT was huge. And then since then, it's been more incremental. Extremely valuable, extremely great. I love it.

But I haven't seen as many of those viral moments. Studio Ghibli excepted, probably. Yeah. I actually think, and I'll push you on this point, that we've had more of those large moments than people appreciate.

And I think it's actually hard to appreciate some of those moments if there isn't a product experience that brings it to life.

And I actually think that's been a lot of the gap. If you look at multimodal, the fact that the models are now better than humans at most vision tasks, just from a multimodal input perspective, the number of products and things that that unlocks is truly mind-blowing if you consider what you would have had to do two years ago to make something work in that ecosystem.

Um, and that is, from an order-of-magnitude-of-impact perspective, massive. Maybe it's not as big as text.

But it's still huge. And we're continuing to see that: now the models are really good at tools, and now the models can actually generate images and audio and video and all this stuff. So I think we are getting those moments.

I also think our expectations have just gone up so much that if the thing isn't, you know, brushing my teeth for me, then it's no longer impressive. Yeah. Yesterday we read a post from nearcyan, and it was a screenshot of the definition of AGI. Read it out.

So it was a reminder of how far the AGI goalposts have moved, and it's a screenshot. It says an AGI could beat you at chess, tell you a story, bake you a cake, describe a sheep, and name three things larger than a lobster. And it's funny because, okay, we're doing everything except baking you a cake. Yeah.

And even then, it'll give you a great recipe, a recipe for any type of cake you could possibly imagine. A sheep-sized cake. It can do anything. Anyway, I want to talk about this idea of the Pareto frontier in AI models.

Uh, Sean, a previous guest on the show, mentioned that Google has been very good at delivering not just high-quality models but affordable models.

Has that been a deliberate strategy, to try and create the best model at every different price point, or is that just a natural outgrowth of the engineering culture and the scale of being a hyperscaler? Yeah, that's a great question. I think a lot of this has been an intentional decision.

I think if you look at specifically 1.5, the Flash series is really where we sort of landed this point the most. I think recently with 2.5 Pro, it's been the first time that we've actually had truly one of the most intelligent models available, and relative from a cost perspective, it's still super affordable.

Um, but I think why we've been able to do that, and why we focus on it, goes back to the fact that Google controls everything, from the product perspective, all the way to how the models are delivered, to how the models are trained, down to the silicon.

So you can make decisions assuming a bunch of those things are going to be true, which a lot of folks don't have the flexibility to do. And the beautiful thing, and I think this point goes under-discussed, is: what does this mean?

It means that builders have the freedom to do stuff. It's not that Google has this really great advantage and the thing we do with the advantage is figure out how to milk money out of people. It's: we have this advantage, and what does it mean?

It means the world gets cheaper AI models, and builders get to go and build the products. The margin for people building AI products actually goes up: the farther you push the Pareto frontier, the more money builders get to make. That's such an interesting and unique thing about this AI moment that I actually don't think has been true in a lot of the other platform shifts that have happened in the past.

Yeah, I want to dig in there. Um, Google's in a unique position in that it's a consumer tech giant, but also a scaled infrastructure provider with GCP, not to mention a B2B player. I mean, I don't know a single startup that doesn't run on Google Workspace. 100%.

And so, Ben Thompson was just talking yesterday about how he's really pushing OpenAI to just go full consumer and let Microsoft handle all of the B2B stuff. Don't let the API load take away from serving your consumers.

Um, with Google, there's probably some sort of tension there, but what are you excited about on the B2B side? And this idea of, you know, oh, if I build a wrapper, is Google going to just steamroll me?

This is kind of an old meme, but now it seems like the best time ever to build on top of 2.5 or any of these different models that have great cost and great benchmark scores. So, what message are you sending to the developer community?

Yeah. So, two things. One, a reaction to your first comment, which is around whether there's value in doing both consumer and enterprise and some of these other things.

I think my push is for companies that are in the position to be able to do that. And there's a bunch of nuance to this, because obviously OpenAI, as an example, is extremely well capitalized; they have the means to do this and do it well. But for companies that can, part of my core worldview is that a lot of the reason ChatGPT has been as successful as it has is because there was an API business built around it. If you think about it through this lens, what the API business did was allow this massive proliferation of AI products that educated the masses that people are interested in this stuff. And then, at the end of the day, who has the world's best AI product? Some people maybe argue it's ChatGPT, maybe it's another product, but you sort of whet the appetite of the world: oh, I can actually do these things, in my product surface, etc. All of that's enabled by the API. And then when you look for, okay, what's the best way for me to use this product, maybe it's one of those consumer products made by one of the larger labs. I think that playbook actually works, and that's why I'm extremely bullish on people who are doing both of these things.

Um, to your question about where the value is for builders today, I 100% agree with you. I think there's so much value to be created at the application layer. I think about these four curves at the same time.

On one hand, the cost of AI is going down and to the right; the cost of AI is down 99% over the last two years. The intelligence of AI is up and to the right; models continue to get smarter and better with test-time compute and scaling and stuff.

At the same time that the models are getting smarter, the costs are going down, consumer understanding of AI is going up. Um, and then in parallel to that, as consumer understanding is going up, as the models are getting better and cheaper, consumer willingness to pay for AI products is also going up.

Um, and it's this beautiful thing: you actually could not ask for a better set of four lines on a graph than those four things. If you're building a product, it's cheaper for you to build the product. Your customers are willing to pay more.

There's more customers and the tool that's actually enabling the value creation is just getting better for free. Like you don't have to do anything. It just gets better for you. And like I don't think there's been a time in human history where for builders all four of those things have happened at the same time.

Um, so I literally wake up every day excited, because people are building companies and all this is happening for them. How do you think about balancing, you know, benchmarks and capabilities, right?

Google's consistently been a leader, yet at the same time every consumer at this point has experienced a new model coming out and performing well on benchmarks, you know, broadly, not talking about any one lab, but then being sort of disappointing in the actual experience with the model. I'm sure you've spent a lot of time thinking about this. Yeah, this is one where, for folks who haven't thought about it, evals are just really hard.

It's such a hard problem. And actually, if you abstract evals out of AI and into everyday life, and maybe I'm too eval-pilled at this point, I think a lot of random problems in life are truly eval problems, and we look at them as this very human thing.

Um, and actually they're evals. The really weak version of this: if anyone's had a job and gone through performance reviews, performance reviews are an eval. Are they a super scientific eval?

No, they're actually a pretty crappy eval. But folks have gone through performance reviews, and yet that is the core foundation of how human career growth goes in a lot of ways. Evals are generally a really hard problem. For AI, they're also extremely hard.

And to answer the question specifically, Jordy, about what that means for people who see this sort of disconnect between capabilities and what the evals say: I think this goes to why there are so many vibe evals.

And LMArena is a good example of this. LMArena is basically capturing vibes. It's: how do people feel about this thing? It's not scientific. They're not actually evaluating whether the model is saying things that are true. It's: how do humans feel about this response?

Um, and I think that's incredibly important. Is that the only thing that matters? Certainly not. But more and more you see launches happening where the vibes are really important, in addition to the actual quality of the models being important. And I do think you can do both of these things.

In some cases there's tension, but I think you have to do both if you want to be successful. Totally. How do you think about the different buckets of foundational research that are happening?

Do you believe in the data wall, the pre-training scaling laws kind of hitting diminishing marginal returns? We've talked to a lot of folks who are extremely excited about reinforcement learning and reasoning going a lot further.

Uh, program synthesis, these kinds of topics, even just more tool use; let's bring more tools in. Google has a million tools and a million interesting APIs, both internal and some external. It feels like there's a lot of low-hanging fruit there.

Um, but what excites you on the research side these days? Is it just, let's build an even bigger data center? You guys probably already have the biggest ones in the world, but you could obviously go bigger. Or is it more algorithm-based?

How are you thinking about the future of the foundation model landscape developing? Yeah, I think two points. One, I think there have never been more opportunities to push the frontier.

I think all those examples that you just described, because the models are evolving into more than just models; they're actually systems that have tools and all this other stuff together.

Um, I think it just increases the number of opportunities for scale and for the models to get better. So that's one of the positive things. We'll continue to see a lot of growth, specifically because there are so many dimensions on which you can make the models better.

Um, but if you look at an example of this in practice, 2.5 Pro is actually a case where it wasn't just RL scaling that made the model better.

Yes, RL was part of the story, but there was also a ton of post-training and pre-training innovation. So it's not any one thing; companies are going to continue to see value created across all of those.

And the really magical thing, and this is why I don't subscribe to the "pre-training is dead" stuff, is that the more work you can do at the pre-training level, the more those capabilities get amplified, almost exponentially, as you do post-training and as you give the models RL capability.

So if you can make the model 3% better at the pre-training level, when you actually finally do all the reasoning work and the model has to reason through really complicated problems, you get many multiples of bang for your buck because of that.

So I think we need to continue to do the innovation across all levels, and that's exactly what we're doing right now. I know you probably can't talk about specifics, but one of my favorite Google stories is that crazy anecdote about the V8 JavaScript engine.

I think it might be apocryphal, but it's like a bunch of Google engineers go off to Denmark or somewhere, spend months building a new JavaScript runtime, it later powers Node, and it winds up being the foundation of the Chrome browser.

Um, I'm interested to hear your take on lower-level optimization. Obviously Google's already doing the TPU, but in terms of squeezing extra performance out of inference, we saw this with DeepSeek.

It seemed like they did not just one or two breakthroughs in terms of cost-performance optimizations, but a ton. Is that type of work happening at Google? Do you want to see more of it?

Um, is it an exciting area, or is it just, oh yeah, that's something that's going to have to happen at some point, but it's not as critical as some of the other stuff happening in the industry right now?

Yeah, you should ask our inference engineers, who are working like 24 hours a day. I think 2.5 Pro has actually been a fundamental example of this. All of this stuff matters.

If you're not doing it, especially with larger models, and especially with models that have lots of demand, there's no world where you can get away with not putting a large order of magnitude of investment into inference. And credit to our team, who are actually working around the clock right now to make it so that people can keep scaling with 2.5 Pro, because there's so much demand. As an artifact of the constraints that we're under, we're having to innovate, solve new problems, and come up with ways to make 2.5 Pro more scalable from an inference perspective.

So it's awesome to watch that happen. Yeah. Where do you want to see more, you know, knowing 2.5 Pro's capabilities probably better than anyone else? Where do you want to see more developer activity?

Yeah, I think the thread right now that has the most excitement is around coding, just because developers love coding. The whole vibe coding thing is a real phenomenon. Um, and I continue to be interested in all the multimodal stuff.

There are just so many products. I think back to early in my career, when I trained computer vision models and deployed them, and the amount of time and resources that took, relative to today, where you can literally just write a prompt and send images or videos to the model and have it do those tasks with basically near or better accuracy than you would get from domain-specific models. It's absolutely fascinating.
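The "just write a prompt" workflow he's describing really is a single multimodal request. As a rough sketch, here is what the request body can look like, modeled on the Gemini REST API's generateContent payload shape; treat the exact field names and the example prompt as illustrative assumptions rather than an official reference:

```python
import base64
import json


def build_vision_request(image_bytes: bytes, prompt: str, mime_type: str = "image/png") -> dict:
    """Build a JSON body pairing one inline image with a text instruction.

    This replaces what used to be a bespoke computer vision model:
    the task is described in plain language next to the image.
    """
    return {
        "contents": [
            {
                "parts": [
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            # Images travel inline as base64-encoded text.
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        }
                    },
                    # The "vision model" is just a prompt away.
                    {"text": prompt},
                ]
            }
        ]
    }


body = build_vision_request(b"fake-image-bytes", "Count the products on this shelf and list them.")
print(json.dumps(body)[:80])
```

The point of the sketch is the shape of the work: no dataset, no training loop, just one request where a sentence of text defines the vision task.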

I think we haven't actually seen that wave of multimodal startups building on top of this stuff.

Um, and I think that includes audio. The audio ecosystem is still pretty nascent; the foundation is being laid for it. People saw this with Gemini Live, people saw this with the ChatGPT version of the product that came out, but I don't think we've seen, across other product surfaces, people actually invest in real-time audio and real-time video and image stuff.

And I think that's the next iteration of the UX of how people are going to interact with AI models. The foundation is all there. It's just the lag of how long it takes people to build interesting products; it just takes like 12 months or something for that to happen. Yeah.

I mean, speaking of building new products, I love the Paul Buchheit story of building Gmail as this almost April Fools' project. And you see a seed of that in the NotebookLM project. What's the culture like around the idea of 20% time? Does that even exist anymore?

And then, if I joined Google, could I just go and say, "Hey, I'm going to go build an AI-native email client. It might destroy Gmail, but, you know, we'll figure it out"?

Or is that something that, you know, you're going to need to think about? Because Gmail's mature, and right now Gemini is kind of being vended into Gmail, but maybe there's an entirely new paradigm at some point.

Um, how do these side projects and 20% time work at Google these days? Yeah, that's a great question. I think my 200% project is all the Gemini developer stuff. But I do think people are doing 20% projects, which I think is great.

If you have the freedom to do that, you should, and that's how Google's going to come up with innovation. I think for NotebookLM specifically, it came out of Google Labs, and the whole point of Google Labs is to come up with new products and services.

And not just NotebookLM. There's Whisk; if folks have seen Whisk, it's an image generation platform that came out of that. Um, actually AI Studio, the product that I work on, came out of Labs originally, and Josh Woodward's team.

So I think there's a whole lot of innovation being seeded out of that group.

Um, and again, it's the big bold bets that you wouldn't see coming out of other product areas, because it just takes time to build these products from scratch, and oftentimes time that those teams don't have.

But I'm super happy, because Josh Woodward, who leads the Google Labs team, now also leads the Gemini app team. So I think we're going to see that fusion of all these new ideas and product spaces.

Um, because he and his team sit so close to the blank slate of "let's build any product to solve problems that people have." Now that they also run the Gemini app, I think we're going to see an explosion of that. You should have Josh on. He's one of my favorite people. Yeah. Yeah, we'd love to.

How do you think about helping, almost like prompting, new consumer behaviors?

I feel like every day on X you'll see somebody post, oh, I'm basically running a prompt every day about a specific industry, like, pull together the most important headlines and news from this industry.

And I'm like, oh, that's really valuable to what we're doing. We should just get into the office in the morning, adopt that, and do that. That's something I wasn't really thinking about.

People are so used to this idea of adopting a model for one specific thing, and then they just do that; maybe they veer out of it a little bit. But how do you think about getting users to just be more creative?

It's helpful when a developer builds an application to prompt a new consumer behavior, but how do you think about that at the application layer at Google?

Yeah, my personal take on this is that this is a bug of the current AI moment. What's the ideal case? The ideal case is that the models and the products pull out what they need from the user in order to create value for them.

And if you look at how all AI products basically work today, you, as the user, carry the burden of creating value with the tool.

And I think that, one, inherently limits the number of people who are going to get value, but two, it's also just a shitty experience, honestly. This is my biggest qualm with AI tools: for most of them, you have to go and make this sizable investment in order to get anything out of it.

There are a couple of notable exceptions to this, like deep research.

I can just fire off a random question I have about how something works, and then I'm given a 50-page research report inside the Gemini app, and then I can one-click turn that into a NotebookLM-style audio overview. That's great, because as a user I don't need to make this large order of magnitude of investment.

The model does all the work for me and I think as we see more experiences like that like I'm I'm super excited.

And this is maybe too absolutist of a product perspective, but I will not build a product where we have to try to convince a consumer to change their behavior, because I think the promise of AI is that these tools are going to be able to pull that context out of you, and you should just go and build that product.

And I think the models are actually good enough in a lot of ways to help you do that today, without having to rely on a user hopefully having the right behavior to make that product successful.

No, it's an interesting thing. I feel like users are so trained on software doing a specific thing that this dynamic, of software, the application, being able to act like a smart friend or a smart coworker that knows infinitely more about a bunch of different subjects and can help you accomplish tasks, is very different. And even as somebody who, you know, I'm not 30 yet, but I've basically spent 20 years using software that behaves in a very specific way, and I need to just totally reimagine how to use software.

I think the push needs to be, though, that you don't have to reimagine. I think about this all the time. The thing that I would love is for the AI tools that I have today to interact with me the way I already interact with software today. Shoot me a text, shoot me an email, call me on the phone.

Like I'm already doing that all day.

There are ways, I think, to bridge that experience gap where you don't need to go into a new flow. And I actually think if we fast forward 10 years, there are going to be a lot of those experiences that look eerily similar to the way they do today, because it's just so ingrained in human culture. Texting is a great example of this. Yeah, like I want a push notification from Gemini that says, hey, you're talking with Logan later, be sure to ask him about this funny story. And then it's like, boom, and I didn't even have to do anything. Everything in AI we're kind of reinventing from first principles, even just the idea of the cron job. We're adding that back in, and now there'll be a whole news cycle of, oh, the AI apps got cron jobs this week. That's incredible. And it's like, yeah, this has been around for a long time, but it really does transform the experience.
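The cron-job idea above is easy to make concrete: a "daily AI briefing" is really just a saved prompt plus a next-fire-time calculation. Here's a minimal standard-library sketch; the prompt text is illustrative, and actually dispatching it to a model is left as an assumed hook:

```python
from datetime import datetime, timedelta


def next_run(now: datetime, hour: int, minute: int = 0) -> datetime:
    """Return the next daily firing time at hour:minute, cron-style ("m h * * *")."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so fire tomorrow.
        candidate += timedelta(days=1)
    return candidate


# A saved prompt that "runs every morning", like the daily-headlines example above.
SCHEDULED_PROMPT = "Summarize the most important headlines in my industry today."


def due_prompts(now: datetime, schedule: dict[str, int]) -> list[str]:
    """Return the prompts whose daily slot falls within the next hour.

    A real scheduler loop would take each returned prompt and send it to the
    model, then deliver the response as a notification, text, or email.
    """
    return [
        prompt
        for prompt, hour in schedule.items()
        if next_run(now, hour) - now <= timedelta(hours=1)
    ]
```

For example, checked at 07:30 against an 08:00 slot, `due_prompts` reports the briefing as due; checked at noon, it reports nothing until the next morning. The "AI apps got cron jobs" news cycle is, mechanically, about this much code plus a model call.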

Um, I mean, speaking of, I want to know more about some of the specific workflows you're enjoying. You mentioned deep research into NotebookLM. Is that something you're able to do within aistudio.google.com and run it all there, or is there a copy-paste step?

Uh, because I've had to do that before, where I've been like, okay, I got a deep research report. It's not reading it to me here, at least in chat. I need to take it over to Speechify and get it to read it to me there. But obviously AI Studio is a little bit more prosumer.

I feel like there's a lot there. I mean, the temperature setting is there; there are still some buttons with almost developer-oriented terminology.

So, walk me through some of your favorite AI use cases in AI Studio, and what you're getting the most value out of, so people can just kind of copy your prompts.

Yeah, John, I think this is actually a great reminder for folks: AI Studio is a developer product, so we're building it for developers. The use case we're trying to build for is showcasing all the models' raw capabilities, so that you understand what the models are capable of and can go and build great products yourself.

Deep research is a great example of this: deep research is built on top of a bunch of capabilities that the model has, like native search functionality, tool calling, etc. And that's available in the Gemini app. So if you want the polished consumer experience, or even the prosumer experience honestly, the Gemini app has that functionality; it has audio overviews; it's a fully baked product. AI Studio gives you the rawest possible experience, and some consumer AI enthusiasts like that, which is why they come to AI Studio. But generally we're trying to showcase frontier capabilities and show you the art of the possible, so that you can go and build the products that you really like.

Um, so I do spend a bunch of my time, from a doing-work perspective, inside the Gemini app. You know, Canvas is another one; vibe coding inside of Canvas in the Gemini app is a lot of fun. Um, someone needs to vibe code a Google Reader. This would be the most viral thing ever. I don't know.

Are you familiar with Google Reader? Probably before my time. It was an RSS reader, and Google shut it down, and it had like a thousand true fans, so they were so upset about it. I was using it. It was fun. But, you know, I understood that they didn't need it. But anyway.

We'll see what we can do. Maybe we'll bring Google Reader back as a vibe-coded app. You would destroy the internet. That would be the greatest marketing for everything that you're doing, if you could bring back Google Reader. Um, anyway, sorry.

Sorry. I want to let him finish about his most interesting AI use cases, and what's fun, and what's in your everyday carry in the AI world.

Yeah, the only other one I'll mention, and this is something that's specific to AI Studio right now, and we'll hopefully have it in the Gemini app and other places, is we have this live API, this real-time mode, where you can go and screen share and show the model what you see on your screen.

And I think this gets to like all the points that I was getting at before about like why is this such a magical experience.

It's because right now the product experience of using AI is: I need to go and find all the context that's relevant for the model and get it into this text box somewhere, or into this list of files somewhere.

And the beautiful thing about screen sharing is that all of the context I need the model to do stuff with is already somewhere on my computer. I looked at it today, or yesterday, or I'm looking at it right now. Just let me show the model what I see and have it do interesting stuff.

So, I've been playing around with a bunch of like pair programming examples like that.

Um, and just generally critiquing work that I've done: having the model sort of watching, able, with my permission, to see the things that I see, and talking to it, is a super cool, very futuristic-feeling experience. That's awesome.

Uh, last question, switching gears a little bit. Ignoring your work at Google and on Gemini, do you have any takes on AI hardware? Do we need new hardware devices? Is it an area you're excited about, or how do you think about that generally?

Yeah, I think on both sides, enterprise and consumer, there's a lot of opportunity, and in a platform shift it makes sense to try to build something. Does it end up working?

I don't have the crystal ball, but I think on the enterprise side the opportunity is to go in and build hardware that makes LLMs a lot faster and more efficient. I think that's warranted, and somebody should do that work.

Um, and then on the consumer side... Yeah, I'm curious. You know, we had the founder of Cluely, which has been going viral, on before you, and he was talking about how the end state is just being embedded into the brain directly. But, and you can feel free to pass on this question, the founder of Cluely applied for a job at Amazon and cheated on the LeetCode questions.

If he were applying to be on your team and you caught him cheating, what's the punishment? Is he on the team, or is he out forever? I think we've actually been looking at doing AI-assisted interviews and non-AI-assisted interviews. I think the world needs both right now.

It makes perfect sense to evaluate both. Yeah. Kind of interesting. I mean, he was doing it as a publicity stunt, basically trolling.

Obviously, I don't endorse actually cheating on real interviews. But it is interesting that you will have to adapt, just like the teacher who was previously assigning research papers that can be one-shotted by ChatGPT will have to adapt. No, it's an interesting way to test how somebody works individually, right, single-player, and to think about AI-assisted work as a sort of multiplayer experience: how do you collaborate with other people? That's a really good eval, as you said. And in a world where the tools actually make a difference in how much you can do, can you use the tools?

I think that's a fundamental question that I don't think a lot of people ask in these job interviews: are you AI-assisted in what you're doing today? Because there's a delta in your output if you are AI-assisted versus not, across coding, across every discipline right now.

Totally awesome. This has been awesome. Come back; I'm sure you're going to have many big announcements this year. You're always welcome to come on and jam with us. Thank you for making the time. Bye guys. See you soon. We've got Erik Torenberg coming into the studio in just a minute.

Uh, he's Andreessen Horowitz's latest general partner (GP). Good friend of ours, good friend of the show. Known him for years, and excited to dig into his new role and do a little personnel news segment. We got a massive trade