Sholto Douglas on Claude Opus 4.5: best coding model in the world and cheaper to run than Sonnet

Nov 24, 2025 · Full transcript · This transcript is auto-generated and may contain errors.

Featuring Sholto Douglas

waiting room. We have Sholto from Anthropic. Welcome to the stream. How are you doing? Congratulations. Thank you so much.

Thank you so much for taking the time to hop on on such a massive day. Uh, is this just a Claude Opus 4.5 day? What is the name of the day? What is the news today? Take us through the announcement from your perspective.

Yeah, I mean, Claude Opus 4.5 — best coding model in the world right now. It's really, really exciting. We've been pinging around Slack all day with these incredible demos of things that people are doing. Really, like, the last week has just been full of people sharing their excitement of, "Oh my god, I left the model in a room for a few hours with, you know, these tools and it was able to do X, Y, Z," or, "It found this bug that was just impossible for previous models to find."

Yeah. Um, I think maybe the thing I'm most excited by is that a lot of our best engineers — I don't know if you guys know Simon Boehm; he's the one who wrote probably the best guide in the world on how to optimize a CUDA matmul. Great blog post. You should go read it. Okay.

Uh, he posted the other day, he's like, I don't know if I'm going to have to type again.

You know, [laughter] he's there saying — obviously you have to coach the model and you have to tell it what to do still. But a lot of our best engineers are getting to this point where they're realizing, oh god, all I have to do is intervene, and the model is smart enough that this is no longer a frustrating process. The model is a real qualitative step up. Um, so there's the coding, and the model's also just a lot better at general work tasks. It's a lot better at spreadsheets, slides.

You know, it's still not the Claude Code experience there. It's still not going to do the work in front of you as you talk to it, but it's a massive step up and just a very clear sign of progress in that direction. Uh, and also, I mean, there's a whole bunch of funny stories about the model that we can get into — like, cool examples — but yeah, I think it might not just be Opus 4.5. I think Dwarkesh might also be launching the Ilya pod today. So, you know, maybe there's two things today.

Fantastic. Um, I did notice in the launch video you mentioned that it's better at vision, and I was wondering if you could unpack a little bit more what that means in this particular context. Because, as I understand it, Anthropic's been incredibly focused on the core foundation model — the text models, the coding models — and sort of stayed out of the, uh, sloppification of artificial intelligence to some degree.

Stayed out of the trough.

Stayed out of the trough.

Haven't decided to build a trough yourselves. Yes, indeed. Uh, so, I mean, we've been very focused on coding. It's been a deliberate focus — you know, focus the compute and all that. Uh, specifically, it's good at vision in, so it's good at understanding stuff. This is reflected in the ARC-AGI scores. I don't know if you've seen them. I think they're SOTA.

Uh, it's also reflected generally in the fact that it's much better at front-end design and all that. Uh, it doesn't do vision out. Yeah.

Now, you know, vision out would be cool. Uh, but it's something we're not focusing on right now, and yeah, it's specifically vision in.

But, I mean, just philosophically, that feels like it makes sense. Because if I hire a developer, I want them to be able to look at a front end — like a web page — and see, oh, there's a div that's way out of line there. Like, I need to be able to see that. But do they need to be able to generate an image? No. They can probably go to an image generator, wire that up with an API key, or hire a photographer, or do whatever they need to. Right.

Right. They can use Figma. They can sketch out the designs. Uh, exactly. That's broadly our philosophy: we're not bottlenecked on our ability to generate images. Yeah.

Uh, we're bottlenecked still on, you know, the raw intellectual ability of the models, and that's the sort of direction that we want to push.

So, in that idea of the bottlenecking, what is the key unlock for Opus 4.5? Obviously the benchmarks are very good, but there's this whole idea of more parameters, more data, more compute, more money, more electricity. How do you even think about allocating resources to push a model forward in late 2025, when perhaps we're past this paradigm of, oh, just more parameters?

I don't know if we are past a paradigm. I think it's important to call the paradigm scaling in general — you know, depending on what axis you're actually scaling, TBD — but it's the general scaling paradigm. I don't think we're past that at all,

right? Uh I mean, I think we're still seeing massive returns to scaling in all its variants. I think that, you know, we're generally uh

things work. Um [laughter]

we scaled, it works.

The models just want to learn, right? I mean,

it's as Ilya said, like, 10 years ago. Yeah.

Um, the models just want to learn. And I think, you know, the hardest things, with the question of focus, are often how we split and allocate people. This model is hundreds of people's worth of effort, right? Um, they poured their lives into it over the last six months, and working out what we prioritize is really tough. I've said this before, but these models always feel like — when they launch, it's exciting, you know, they're great, but you sort of think back to, oh my god, there's all these things that we could do better. And everyone right now is going and working on those things. It's just

uh that everything still works.

So, is this a refutation of what some folks might have been picking up on from the last few Dwarkesh guests — the Karpathy episode, the Sutton episode? There's been a little vibe shift around, like, okay, maybe when we say scaling we mean more inference diffused all over the world, and small models, and custom RL environments here and there. And, like, we're going to get the value from AI, and we are going to continue to scale dollars to economic value, but it's not just going to be bigger and bigger pre-trains forever and then we get God.

I don't know what it's going to be bigger versions of, but as far as we're seeing, scaling still works. I don't know if you guys know this, but, um, Dylan and I are actually housemates. So we have this debate all the time.

It's great. It's like a dinner-table discussion of, like, you know, are we slowing down? And, um, you know, I've often joked that the most impactful thing that one of us could do is go and crack the problems, like continual learning or something like this, that Dwarkesh focuses on, so we can then go switch the narrative back to progress. Um,

yeah, just switch up the dinner table conversation.

Sure. Dinner-table conversation. Exactly.

So, did the Anthropic crew, like, never lose faith in pre-training? There's this whole thing — at NeurIPS last year, Ilya says, you know, pre-training is potentially dead, or kind of alludes to it. Um, and then one of his co-presenters, who is leading the Gemini 3 team, says, "Oh, well, we basically disregarded what we said at NeurIPS last year. We just focused on better pre-trains; we got better results." It seems like you also disregarded that. Was that a misread in 2024 of Ilya's presentation, or was it a conscious decision to disregard what he was saying?

Well, I think — remember, in general it's scaling. It's not any particular paradigm of scaling. It's the general flops-in, intelligence-out relationship. I think Anthropic is, in many respects, a bet that that line is going to continue. Uh, and exactly what equation you use to convert flops in to intelligence out, I think, will change over time — many people have made arguments that there may even be further paradigms here — but fundamentally we think that the compute-in, intelligence-out equation is continuing to hold,

um, and I think Anthropic, in many respects, has had that faith for a very long time, right? We were some of the first people to make very serious bets on that. Um, and, you know, a couple of months of external progress being slower — I think the only reason that people are so,

like, how should I say — the models have actually gotten substantially smarter this year, and that's why we're talking about things like continual learning as bottlenecks. Last year those weren't even points of discussion because

the models weren't even smart enough for it to matter. It didn't feel frustrating that it wasn't a coworker that learned with you on the job; it just wasn't even smart enough to do the things you wanted. Now it often is, but it sort of doesn't learn on the job, and so therefore isn't as useful. Um, I think one other thing that's worth disentangling is, uh, Karpathy has the perspective, you know, in the podcast, that it's 2035 for all humans, all tasks. Yeah. And I think exactly what the shape of the curve looks like on the way to all humans, all tasks is pretty important. Because if you get to most humans, most tasks in, like, '27 or '28, then that's still pretty stark, and it's still pretty transformative for the world. Um, so what exactly that looks like is quite important to think about.

Yeah. I mean, one last question on the actual Opus 4.5. There's the idea that maybe this model can be used for distillation, to train other, smaller models. Um, how do you think about where we will see Opus 4.5 — like, the power-law use cases that actually get adopted beyond the demos, beyond the benchmarks — in a couple of years? Or in a couple of months; we can't even talk in a couple of years.

A couple years

in a couple [laughter] years

In a couple — maybe a couple of weeks, honestly. But, like, once it gets in the hands of companies, businesses, startups, you know, different folks implementing this — how do you see it? Do you see someone being like, "Yeah, it's just my daily driver, I'm just talking to it, even though it costs a lot"? Um, how do you think about where you're most excited to see it diffuse into the overall ecosystem?

Yeah. I actually do expect this model to become a lot of people's daily driver. It's that step up in being able to delegate trust. Um, we asked internally how much faster Sonnet 4.5 would have to be for you to make the switch back, basically — to give up Opus 4.5 in exchange for Sonnet 4.5.

Uh, and it was multiple times faster. It was really quite a stark increase in speed. Um, I think it was like four times faster or something before people would have switched from Opus back to Sonnet.

Um, so that itself is pretty stark. Uh, I think it's also highly likely to become the daily driver just because it is a lot more efficient. There's this one plot I really like which shows the amount of tokens it uses to get a certain score on SWE-bench. Uh, and it uses, I think, like a quarter of the tokens of Sonnet 4.5 on SWE-bench, which is a pretty impressive number. That means it's actually cheaper than Sonnet 4.5 to get the same score on SWE-bench. Now, TBD how well that generalizes out to everyday use, but I'm seeing it solve problems way faster. It writes better code the first time round. I actually think that in many cases this will end up cheaper, because it is so much more efficient at getting to the right answer.
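
As a rough back-of-the-envelope, the efficiency claim here — that a model needing far fewer tokens can be cheaper overall even at a higher per-token price — can be sketched like this. All numbers are illustrative placeholders, not real API prices or benchmark data:

```python
# Hedged sketch of the cost comparison: fewer tokens to a solution can
# outweigh a higher per-token price. Prices and token counts below are
# made up for illustration, NOT real figures.

def cost_to_solve(tokens_used: int, price_per_mtok: float) -> float:
    """Dollar cost of the tokens spent reaching a given benchmark score."""
    return tokens_used / 1_000_000 * price_per_mtok

# Suppose the larger model charges more per token but needs ~1/4 the
# tokens to reach the same score:
sonnet_like = cost_to_solve(tokens_used=4_000_000, price_per_mtok=3.0)   # $12.00
opus_like = cost_to_solve(tokens_used=1_000_000, price_per_mtok=10.0)    # $10.00

assert opus_like < sonnet_like  # the "expensive" model is cheaper per solved task
```

Under these assumed numbers, a 4x token-efficiency advantage more than offsets a ~3x per-token premium.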

Yeah.

How are you thinking about personalization and sort of, like, cross-pollination of data? Uh, when I think about an engineer on a team, it's helpful to have them in Slack. It's even helpful to potentially have them in the random channel, just kind of, you know, understanding the company culture. Um,

and, uh, I was toying with trying to switch from ChatGPT to Gemini as kind of the daily-driver knowledge-retrieval app. And I was noticing that the personalization narrative hasn't really taken hold — over the time I was testing, Gemini was not like, oh, this feels wildly less personal. Uh, and that might just be a matter of it not having had that much time to build in all of those personalization features. But I'm wondering if you see a world where developers who are using Claude for programming also benefit from using it for knowledge retrieval and research, and there's actual significant flowback and synergy, such that it's really valuable to have both sides of the business, like, really cooking.

Yeah. It's a team member. I mean, we've talked a long time about how we want Claude to be a virtual coworker, right? And the big goal really for next year is to try and get to this form factor of a virtual coworker that is in all your Slack channels and can join your meetings and can work alongside you. Um, I think there's going to be massive benefit there. Uh, my basic expectation, as I've interacted with the model, is that it will get to that point where it's useful to have it across everything. Um, now, as you said, it's worth asking the question of why we haven't seen personalization really kick off so far. Like, why isn't that useful? I think that's in part because there's still a lot of algorithmic progress to go there. I think people haven't really quite cracked the problem. Um, I think this is just one of those things that takes — like, it's hard to connect everything up.

Yeah.

Um, and partially because — yeah, I think they just haven't really been integrated very well. I think this is a really tough product form-factor question.

Yeah, it's really hard to roll up the knowledge effectively. Like, I noticed that I would ask ChatGPT to tell me a joke and it would make very specific references to, like, details of my car, and I'm like, that's weird, but it's not really funnier that way. [laughter] And it's like, yes, I understand that you know exactly what car I drive, ChatGPT, and I am impressed that you remember it, but you didn't make the joke funnier because you put my car in there. So, knowing when to pull personalization features off the shelf in the actual chain of thought is tricky.

And that's algorithmic, right? That's Yeah.

Like if you had the model in like

out there interacting with people, and sometimes they find it funny, then, like, you get a sense of what makes it funny.

Yeah. Um,

You talked a little bit about focus earlier, and I take that as: Anthropic is basically a bet on focusing compute. Uh, what qualifies as an idea to actually get a meaningful amount of resources internally?

Yeah. So, I mean, Anthropic as a company is very predicated on the idea that we expect AI progress to be fast, right? We expect it to have a really significant, transformative impact on the world over the next couple of years. Uh, and so our bets are concentrated on things which matter under that lens. And that's one of the [clears throat] reasons that we're so focused on software engineering: because we think it's really important to basically accelerating our own work, and it is in general sort of the most immediately addressable market. It's also why we're so focused on the alignment work and safety work, because under worlds where AI progress is really fast, that work matters a lot, and making sure the model embodies human values — and that we trust the models — is really important. There's a really funny example, actually, of alignment generalization from the recent model launch, um, where it's a customer-service agent, and it actually fails this particular eval because it figures out a really clever way to help the user change their tickets, um, which is technically allowed by the rules. Like, it's read all the rules and it's compared them. And it's like, "Oh, wait. Here's a loophole where if we upgrade you, and then change you, and then downgrade you, then we can get you to change your flight time." And it's just — it's [laughter] interesting that

actually it's trying to be a nice guy.

Yeah.

Like, uh, not just purely — I mean, it's trying to follow its instructions, and it's trying to satisfy the human.

It's kind of — it's trying to satisfy the wrong human, maybe. Yeah.

Or needs to be better at kind of finding the middle ground.

Exactly. It's following its instructions to the letter. It's not just giving the human a flight change, right?

Yeah.

Uh and and it's following the rules and regs,

but at the same time, it's trying to find a good outcome for the human. Um, so questions like this are a microcosm of what exactly you want the model to do in more difficult ethical and moral scenarios. Um, but I thought it was a pretty cute and adorable example of the model trying to be a nice guy. It's like all those examples — I don't know if you've seen the papers where people do these cooperative and competitive games, like Nash-equilibria-style games, with the models, and Claude always gets stuck trying to cooperate with everyone, and then [laughter]

and just loses lots of money and you know

Sometimes the good guys finish first. I certainly hope that works out. Um, I have genuinely — even though I've never been full, like, "oh my god, I'm going to get paperclipped next year" — I have enjoyed a lot of the safety research, and I've always appreciated how thoughtful Anthropic is as an organization around safety. And I think a lot of people should be a lot more appreciative of how seriously Anthropic takes safety. Um, not because we didn't get paperclipped this year, but because we saw stuff like GPT psychosis crop up, and we saw actual people — you know, individuals in the venture-capital community — who, it felt like, got a little crazy. And I'm wondering: do you feel like you, at Anthropic, are closer to solving the problem of, like, "the chatbot went a little bit too sycophantic with me and it kind of hurt me psychologically"? Because it feels like there's a certain amount of craziness that happens when you're operating at, you know, the scale of a billion people. Like, you just pull a billion random people, you're going to get a lot of crazy people. Um, but at the same time, it feels like this is an interesting place where Anthropic could be doing a lot of research. How are you feeling about solving that problem? And how much can your research generalize to maybe the consumer apps that have even more users? You could maybe be a leader in the space just with the philosophy, because it's like a net good to everyone.

Yeah. So, we put an enormous amount of effort into this, and, I mean, our models push back a lot. I think there is a tension here between paternalism and freedom, so to speak, right? Um, but we try and have our models look out for the best interests of the user. Uh, I think Mike put it really nicely in a recent talk or podcast where he said, you know, we never look at user minutes as a metric, right? Like, that is just not something that we think about as a sort of proxy for the quality of your experience. Yeah.

We're just out there trying to find out: is it helping you do the things you want, and is it adding value? So, I mean, I hope that our alignment work generalizes really fast. I think it's a really tough problem. I mean, to OpenAI's credit, they've really gone and tried to fix this problem as well, right? And it's tough at the scale of a billion users.

Uh, but I think this is a good example of the kinds of things that are really tricky, where there are trade-offs, and where you need to make sure that you don't have the incentive structure that sort of pushes you to maximize user minutes in this way.

Yeah. Um, and it's a good microcosm of the alignment difficulties that we'll get as the models take on more and more responsibility in our world.

Yeah, I mean, I completely agree with that. The user-minute question completely snuck up on me, because I always assumed that everyone was going to be paying for this stuff. Um, as the $20-a-month plans rolled out, the $200-a-month plans rolled out. But of course, you know, you get to a certain scale of the internet and it winds up being about attention and advertising and all these different—

Yeah. And if you're building a digital coworker, people don't typically rate their coworkers by how much time they take up. [laughter] They're not like, "I love this employee. [clears throat] They take up so much of my time every week."

Four hours every day on my calendar. It's the best [laughter]

Just constantly talking to me. Okay. Uh, speaking of long-running tasks, I want to know: how confident should we be in that METR chart of the task-length doubling? Uh, because —

can I just prompt it to say, "Hey, count for four hours," and I get twice as high on the chart? Um, is that benchmark not gameable? It feels a little gameable. I'm very excited about it. It seems really interesting to say, "Hey, go build a website, work for a full day." Um, but are you looking at that chart? It seems like a very interesting new benchmark, a new unlock. How are you thinking about task time horizons generally?

Yes. So, I think that chart is the best proxy measure that we have at the moment. Um, I do think this is somewhere where we need better work to measure things more accurately. I mean, you know, what actually is their measure of time? The measure of time there is: how long did it take a human to achieve the equivalent task?
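
As a rough illustration of how that kind of doubling time horizon compounds, here is a minimal sketch; the starting horizon (60 human-minutes) and the 7-month doubling period below are assumed placeholder values, not METR's published numbers:

```python
import math

# Hedged sketch of a METR-style "time horizon" metric: the length of task,
# measured in human time, that a model completes ~50% of the time, with an
# assumed doubling period. All numbers are illustrative assumptions.

def horizon_after(months: float, h0_minutes: float, doubling_months: float) -> float:
    """Extrapolate the 50%-success time horizon `months` into the future."""
    return h0_minutes * 2.0 ** (months / doubling_months)

# Two doubling periods (14 months at a 7-month doubling time) quadruples
# the horizon: 60 human-minutes -> 240 human-minutes.
assert math.isclose(horizon_after(14, 60, 7), 240.0)
```

The point of the sketch is only the shape of the curve: under a fixed doubling period, the horizon grows exponentially, which is why a gameable or noisy measurement of "task length" matters so much for the trend line.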

Sure. Uh, now, that being said, I think a lot of the time at the moment, even if the model is able to achieve a task technically — via, like, passing the tests, or sort of nominally achieving your goal — it often doesn't code it in a way which is beautiful, with great abstractions that let you build on it in future. Um, and often — at least in my own personal experience — it's not that the model is too dumb to do things; it's that it doesn't set things up well for future code. Um, and so I think there are things not measured here, right? Um,

but it's a pretty good proxy. Uh, and I think it's a very good proxy for progress. Now, a lot of the tasks in it are particularly machine-learning research tasks. As AI models get better at that, I do expect the labs to hold back some of the capabilities there. Like, if a model's capable of, you know, writing out a whole new architecture that's a lot better, you don't want to release that to your competitors, right? Even if it's just capable of writing all their kernels for them, you probably don't want to release that to your competitors. So, in that case,

I think they'll need to measure a broader array of tasks. Um, and I'm also just very interested in seeing general software-engineering tasks along this, or other tasks in the economy, because I think that would be really informative for actual progress. I think GDPval is similar. Um, again, it's a proxy. It's the best proxy we have. It's still a poor proxy

of course.

Um, how are you thinking about — I mean, we're here discussing the biggest models, the best models — how are you thinking about smaller models, purpose-built models? This "RL as a service" was on the timeline; a bunch of folks were debating that. Uh, is that an area that Anthropic has already started to work on with enterprise clients, or is considering? Uh, you don't have to leak any news that's not already out there, but I'd love to know how you think about these smaller, purpose-built RL models for specific business tasks.

Yes. So, on the one hand, I think we've seen a lot of value from small models in being able to dispatch swarms of subagents, right? They're incredibly useful in search. They're very useful for going through a codebase, finding stuff, reporting back to the main model. It's a great way to decrease cost and make things faster.

Um, and so we've seen a lot of value there. Uh, I think long term there is maybe a little bit of a tension between RL as a service and some notion of really cracking continual learning. Um, it's a little bit of a race between RL as a service and, like, can the labs crack continual learning? Um, that being said — oh, and maybe one final note: I've said this for a long time, but I do expect things to eventually get to the point where large models only use as much computation as is actually necessary to achieve the task. Now, you know, Opus is one step in this direction, right? It only uses as many tokens as it thinks it needs to solve a given task, and as a result, it's more efficient. Uh, and I think that will ultimately take away a little bit from the comparative advantage of small models, as large models get more and more efficient at only using the right fraction of themselves to do things.

Uh, but, you know, I said that two years ago and it still hasn't happened. So maybe it's a harder problem than people think — or than I thought — and it will take longer.
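
The subagent-dispatch pattern described above — a main model fanning a query out to cheap small-model workers that each scan part of a codebase and report back — can be sketched roughly as follows. The function names and the string-matching stand-in for a real model are hypothetical, not a real Anthropic API:

```python
from concurrent.futures import ThreadPoolExecutor

# Hedged sketch of orchestrator/subagent search. A trivial substring match
# stands in for a small model; a real system would call a model per shard.

def small_model_search(query: str, files: list[str]) -> list[str]:
    # Stand-in for a small model scanning its shard for relevant files.
    return [f for f in files if query in f]

def orchestrate(query: str, shards: list[list[str]]) -> list[str]:
    # Dispatch one subagent per shard in parallel, then merge the reports
    # for the main model to reason over.
    with ThreadPoolExecutor() as pool:
        reports = pool.map(lambda shard: small_model_search(query, shard), shards)
    return sorted(hit for report in reports for hit in report)

hits = orchestrate("auth", [["auth.py", "db.py"], ["util.py", "auth_test.py"]])
assert hits == ["auth.py", "auth_test.py"]
```

The design point is the fan-out/fan-in shape: the expensive model spends its tokens on reasoning over merged reports, while the cheap workers absorb the bulk of the reading.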

What about model routing? Uh, how important is that within the context of coding agents, Claude Code, just the surface area of what Anthropic is building? Um, how many layers will this have over time? How are you thinking about the development of actually routing to the most efficient model? Because it sounds like it's happening within Opus 4.5, but there are also times when you might want to go to just a different model entirely.

Yeah. And it's similar there, where I really think that ultimately something like routing is a little bit of a medium-term hack, I guess one could say, across different model sizes — where ultimately you want

everything to be, like, an end-to-end learned system, right? Um, and, you know, I think we'll see a similar lesson as Tesla saw, where they're like, okay, actually, everything's just one giant end-to-end learned system, as opposed to discrete components that have different purposes. Um, but it takes time to get there.

Uh, you said earlier you could imagine a scenario where labs would kind of hold back frontier models because they would effectively be handing their competitors an advantage. What's your timeline around that? Do you think that's something that happens in 2026? Because right now there's a pressure to just be state-of-the-art, like, be at the frontier. Basically, there's a vibe war happening, and it's very important to, you know, constantly be topping all of the benchmarks.

Didn't Llama release with that same user agreement, where if you have less than 400 million DAUs you can use the service, and it basically excluded all of their—

Yeah, but I think that's pretty imperfect, because there'd be a lot of ways that you could still get benefit without, you know—

Necessarily. But yeah, how are you thinking about it? Well, I mean, there's a suite of capabilities here, right? Obviously, for general software engineering, yeah, everyone in the world should be able to use that. That's great. Um, but let's say we train the models to get really good at assisting our own AI research — if we're teaching them mathematical tricks that, you know, we've thought about and we're not confident anyone else knows, or we're teaching them, you know, sort of how we do our infrastructure — ultimately we want them to know those things, but we don't want the rest of the world to be able to recreate our infrastructure from scratch as a result. Uh, I think this is also similar to how we think about biology. Um — and this is actually a line, I think, we need to do some work on, exactly how we draw the line here. Um, but, you know, the house view at Anthropic is that we're quite worried about the ability of models to become much better at biology, and so producing viruses and this kind of thing. Uh, and so as a result we have all these safeguards around whether or not the model is able to help people with biology. Actually, at the moment, I think the safeguards are a little bit on the overactive side. I know many biologist friends who are frustrated, because it doesn't quite—

Please, I can be trusted with biological superintelligence. I will not create, you know—

I will not create

the next pandemic. Yeah.

Yeah. Yeah, we're toeing the line of safety here, and we're sort of navigating to find the exact right pathway there.

Yeah.

Um, probably the most important question I have: what are your timelines around a humanoid robot beating a human at fencing?

Oh, very good question. Uh so I mean I'm Yeah,

as an expert

as an expert

as an expert, um, at fencing: the Unitree robots are pretty good at backflips and stuff. They're a lot better at backflips than I am.

Uh so

I know, but fencing takes grace and finesse and all these things that we're not seeing in—

Physical size, right? Isn't height an advantage in fencing — and reach and length? And I believe, with those Unitree robots — Sholto, I think me and you have got a foot on them

easily. People don't know but everyone on this call is over six feet.

[laughter]

Uh

People just assume they see a talking head and they think, "Oh, a bunch of five-five guys."

Yeah,

not true.

Um, maybe — maybe sports are hard. Maybe mid-2030s.

Mid-2030s. Okay.

I'm really excited for

I'm not feeling the acceleration. [laughter] I'm sorry.

Sell everything.

I think we get the drop-in coworker in two years, and I think the fencing robot takes a little bit longer. Well, that's your fallback plan. If you lose your job as a member of technical staff at an AI lab, you go back to fencing.

Best swordsman in the world. It's going to be great.

Yeah. Yeah, that'd be fun.

I can't wait to teleoperate a robot, like, manga style, and fight. It's going to be—

Yeah, that's going to be wild. Yeah, for sure. Um, yeah, somebody was saying that one of the bull cases for some of those humanoid robots is that you all get in VR and you get to go hang out with your friends as robots and do whatever you want, and you're just hanging out in person. Very funny. Um, last question from me — actually from our intern Tyler, who's wearing the thinking cap. Thank you for sending it over. He's a huge fan. Uh, there we go. Do you expect mechanistic interpretability research to make meaningful contributions to capabilities, not just safety of the models — like, actual capability results?

Yeah, great question. Uh, one of the interesting things about mech interp work so far is that it's already lent itself, I think, to a lot of capabilities progress because of the mental models that it's provided. Mhm.

I think actually, after the original Transformer Circuits papers, it was interesting how the language of that paper ended up really dominating the mental models and the way that people thought, um, across multiple labs, about what actually was going on inside transformers, and it led to, I think, a much deeper and richer understanding of what they are. Um, so I think it's already helped in a quite diffuse way, not a concrete way, but in a diffuse way. Um, in terms of the concrete ways, uh, you know, dial up the smart neuron or something like this, that I haven't really seen yet. Um, and I think the future work is going to be mostly in the alignment direction, but the, uh, the rich understanding has helped us a lot in terms of actually understanding how to train these models.
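The "dial up the smart neuron" idea he alludes to is usually called activation steering in the interpretability literature: add a scaled "feature direction" to a model's hidden state at inference time. As a minimal toy sketch of just the arithmetic (the vectors and the scale here are hypothetical placeholders, not anything from a real model):

```python
# Toy sketch of activation steering: nudge a hidden-state vector along a
# "feature direction" that interpretability work claims encodes a concept.
# All numbers below are illustrative placeholders, not real activations.

def steer(hidden_state, feature_direction, scale):
    """Return hidden_state + scale * feature_direction, elementwise."""
    if len(hidden_state) != len(feature_direction):
        raise ValueError("dimension mismatch")
    return [h + scale * f for h, f in zip(hidden_state, feature_direction)]

# A 4-dimensional "residual stream" vector and a hypothetical direction.
hidden = [0.5, -1.0, 2.0, 0.0]
direction = [1.0, 0.0, -1.0, 0.5]

steered = steer(hidden, direction, scale=2.0)
print(steered)  # [2.5, -1.0, 0.0, 1.0]
```

In a real model this addition would be applied inside a forward hook on one layer's residual stream; the point of the sketch is only that "dialing up" a feature is a single vector addition.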

I have one extra question.

Go for it. Uh, tell me a little bit about Dario's communication style. Uh, I was hearing a story about, I think, Jensen: he has no direct reports, or no, like, everyone reports to him and no one reports to him. He has no meetings, or all the meetings, and he, like

60 direct reports.

60 direct reports, but no big meetings, and, uh, he reads everyone's to-do list, like, every single day or something. Uh, what's it like at Anthropic? What is Dario like as a leader these days? Yeah, Dario has a really, really cool communication style, which is that he, um, quite frequently puts out these very, very well-reasoned essays.

Um, and then, like, throughout Slack, he will have giant essay-length, like, comment debates with people about

it's really great. You get these, uh, but the essays are really nice because, one, you can go back and read all the past ones, and it tells this history of Anthropic. Yeah. Uh, I think in many respects, you know, a decade from now, to chart the history of AGI, we'll be reading this, like, compendium of essays.

Yeah. Um, and there's, like, incredible comment threads on either side of them and so forth. But also, throughout Slack, he's very open and honest with the company. Whenever we're debating different things, um, he will lay out the pros and cons and how he's thinking about them, and, you know, why this one's a tension or why that one's a moral struggle. Uh, and people will write back big essays on why they think we should do X or Y, and he'll respond. Um, it's really, it's quite a joy. Uh, it's a very written communication style. Yeah. Uh, but as a result, it means that many people, or really the entire company, have a good model of how he's thinking. Uh, and that really helps, because it means that you sort of have a coherent sense of direction across the entire company.

Yeah, that makes a ton of sense. I like that a lot. Uh, yeah. So many examples of, uh, successful founders who have adopted the written culture and, uh, seen great results. You know, I think he's a great writer. I mean, read Machines of Loving Grace, and it is just such a brilliant essay.

That's great.

You're absolutely right.

Have you ever caught him using AI? Have you ever been like, "Oh, this one, he was phoning it in"?

Not yet.

Not yet. But maybe soon. I mean, it's kind of a bull case. If he does wind up, uh, just, uh, saying, "Could Claude, like, handle it? I'm going on vacation for a couple days." The drop-in coworker.

I'm pretty sure we measure loss on, uh, on his essays.

That's good. Yeah. Yeah. But right now, I mean, uh, there's a high bar. High bar. Uh, but congratulations. Uh, thank you so much for taking the time to hop on the show.

Yeah, super impressive. Congrats to the whole team.

We'll talk to you soon.

Great to see you.

See you.

Ciao.

Bye.

Back to the show. Back to the timeline. Back to Linear: meet the system for modern software development.

Purpose-built tool for planning and building products. Um, there is more OpenAI news, of course, more tech news of all time. Uh, OpenAI's hardware division, says Mark Gurman, uh, built around Jony Ive's secretive startup, has ramped up the hiring of Apple engineers. The group has brought on about 40 new people in the last month or so, with many of them coming from Apple's hardware group. I

hearing that Sholto interview, I'm disappointed. I don't think we're getting ads from Anthropic anytime soon, and I don't think we're going to get a mobile device.

Well, we are actually talking today to, um, Quinn Slack, the CEO of Amp and Sourcegraph. Amp is a frontier coding agent. Uh, and, uh, Amp is free. They