Max Schwarzer on GPT-5's post-training breakthrough: rebuilding the stack from scratch to cut hallucinations

Aug 7, 2025 · Full transcript · This transcript is auto-generated and may contain errors.

interpretation. And I think it has a draw this with like SVGs. Anyway, we can talk to our next guest about it.

Last post Jirro Ticket says, "I went to the permanent underclass party and everyone knew you."

Anyway, uh back to the serious interviews. Welcome to the stream, Max. Good to see you. How are you doing?

What's happening?

Nice to meet you guys. Yeah. Uh doing well. It's a relief to have this launch out in the world. I think it's uh you know, we've been working on this for the last few months now and it's exciting to let the whole world see what we've had.

Yeah,

just a few months.

It's been uh I don't know. It's been a little while.

What's the what what's the actual launch day like? Because you're actually getting this out into the world. uh the GPUs are on fire or about to be on fire warming up but uh is that is that out of your purview? There is a different team for that fortunately.

So right so I run uh I ran a lot of the research for GBD5. I don't necessarily handle the deployment but I do get dragged in when the GPUs are on fire. Um I I think we're we're moderately burning right now.

Okay. Okay.

Like a two alarm fire.

Yeah. Yeah. Yeah. Uh is it is it materially different? I mean, this is a launch day, but uh we'll probably discover like the Studio Gibli capability once it gets out into the long tail of like, you know, hundreds of millions of people try it. Someone comes out with some genius thing, then everyone's doing that and then the GPUs because I I feel like the Studio Gibli thing happened like a few days after the launch of images and chat GBT like

it. It did. It was um it was pretty fast, but it within about a week. I think in this case, we're going to we're going to see that here.

Okay. Um, I think coding, you know, if I had to take my bets for what the Studio Gibli thing is going to be, it's it's coding. Um, that's the place where I think GBD5 is like most tangibly a huge leap ahead of GPD4 and ahead of 03. Do you think there's a chance that that that the coding will mean a Studio Giblly style uh meme or or kind of like uh and and what I mean by that is that is that like image generation is incredibly valuable in the context of like Hollywood will be using AI to chroma key and and rotoscope in a professional environment. But yeah, what was special about Studio Gibli was that anyone was making these custom images and I could imagine a world where, you know, even going from like the levels IO example of like I vibe coded a flight simulator. If we wind up in a studio Giblly moment for coding, I would imagine it's like everyone built their own game today.

I think that's pretty much it. Yeah. So I I don't know if you guys watched the live. That was that was one of the things we had on the live stream. Like you can just go into chatbt. Um, if you try it right now, it might or might not work because the GP5 rollout is still ongoing. But if you have five, you can just tell it like basically make me a game.

Yeah.

And it will make it and you can actually play it in Chad PT.

That's amazing.

So I Yeah.

discover that. And you don't The thing is like like with Studio Giblly, right? Like for for Gibli, you don't have to know how to draw to make it work. For this one, you don't have to know how to code.

Yes. But

can you share Can you share that chat and someone else can play the same game? How how does the kind of sharing mechanism work? Yeah, there's a there you can do the share link. Um we're we're I think going to try to make sharing for these a lot better over the next few days. That was P2 after the P1 and P 0 of making the GPUs not completely melt.

Yeah. Yeah. Yeah.

But yeah, I we will try to make it much more terrible.

Yeah. Yeah. I mean the the Studio Giblly thing is so interesting because it was it's not just that the model capability was there but it's also like the prompt was two words and and and it was so reliable that you always got a good result and you could personalize it. So even if it wasn't like I've seen people build Doom. I've seen people you know you can just buy Doom. It's a real game. You can build it. Um, but if you build it and I'm like, "Oh, that's cool. He did it in a vibe code environment or in chatbt, like that's awesome, but I don't necessarily want to go do that for myself." But as soon as it becomes personal, which is what the studio, like I had to see what I looked like as Studio Gibli, I had to see what my favorite photo looked like, my favorite meme looked like in Studio Gibly. And once that happens with games, people will will eventually uh, you know, there'll be this like mimemetic explosion and you'll see the GPUs will truly be on fire.

Yeah. I mean, I think even today you could probably with GBD5 do Doom, but all of the characters are like all the enemies are head shot of your friends. Like,

here we're going now. Now, we're real close. Yeah, we're real close. It's going to be something that's personal, something that, you know, you can express your own creativity through because I think people, they still latch on to that. They don't just want, you know, a copy of what already exists. They want something new. And and and the Studio Gibbly moment was just new enough. Anyway, we should talk about actual research. We should talk about post training. Um, what uh what's the thing you're most proud of? like like what can you give us on without you know immediately getting poached what what can you give us on on the actual innovation that went into GPT5 from a post post training perspective what are like the the kind of keywords and and paths in the tech tree that we should be digging into over the next few years to understand how this works

you know I would say the thing that is most impressive to me about GBT5 is how much getting all of the details right matters

like when I look at GBD5 You know, we had an early version of this thing a while ago that was kind of okay, but clearly did not meet our bar for revolutionary. And we're trying to figure out, you know, why why is that not as good as it should be? And the team basically just went off and did a deep dive over a couple of months of just completely rebuilding the post-training stack for this model. And turns out that when you do that, you get what would have taken, you know, another order of magnitude worth of pre-training improvements to to produce. How much are you thinking in pro training in research about let's forget the benchmarks and just focus on user satisfaction like NPS score basically or like user user minutes or any of these other uh the real benchmarks.

Yeah. The intangibles revenue people using Yeah. the feeling and the and the joy and the actual value that's delivered. Um because Studio Gibli was a delightful moment. It wasn't a benchmark.

Yeah. I I think so that was something that we took very seriously for GBD5. It's like look at what people are actually doing with tragedy and look at where the model is failing them.

Mhm.

Either in the sense that the model is like sort of like you said it's not enjoyable to use.

Yeah.

And so we we did I think make a lot of progress on that. Like GBD5 is much more engaging than our previous really smart models. Like 03 I I don't know if you guys talked to 03 in the past. It's a bit bland.

Sure. And GBD5 I think has a lot more character is a lot more more interesting. But then also like I think for we really care about just actually being accurate like giving if a user is trying to do something economically valuable with our model. We want to make sure it lands correctly.

Yeah.

And so what we did there is just like look at act the actual distributions of what people are doing with our models in the real world. Figure out where the models are going wrong. Build interventions to target it. And that was where, you know, we got, I think, the most impressive improvements in GBD5. Like 03 would just get things wrong and not tell you it wasn't sure it was it was incorrect. And GBD5 is much much better about like actually being honest when it thinks it might not know.

Yeah. How explicit is are all the different pieces of the post-training pipeline? Like you have you have uh you know safety postraining, you have uh stop hallucinating, give me the real facts. you have make sure the text the the the flavor the tone is pleasant. Um there's so many different things to optimize for. How much of that is like try and just blend it all up into one thing versus like explicit passes chunk it out like split it up. H how how much can you decompose the problem? So, you know, my background is in reinforcement learning. And I think you when you look at something like this, the magic is in the reward function, right? It it's in what you're actually telling the model to be good at.

And so fixing things like hallucinations to a huge extent is a essentially a function of just fixing the reward function,

actually making it so that the model is reliably penalized for saying something that's false. And if you do that, all of a sudden the model stops saying things that are false. Ditto for safety, right? Um, you know, on the live stream, Sachi talked a bit about the the way we've changed safety for this model. And to a huge extent, it's it's just a function of we we're actually putting out a paper today on the new safety stack for this model. And the core insight in that paper is just figure out what you actually want to optimize for, which in our case is helpfulness conditional on not saying something that's actually dangerous or harmful. You know, write that down, figure out what that means as a reward function, then optimize it for it.

Mhm. It's really not magic at all. It's just again it's it's what I said earlier. You got to get the details right. You know, if at any part of that process you screw it up, the model will be unusable. What's your current thinking on spiky intelligence? And is there some is there some flywheel that you can get started where you're identifying low points that aren't spiky enough and then you're like almost automatically setting up the infrastructure the eval to then RL against to create a spike.

I think GPT5 was a preview of what's possible in that respect in the future. Yeah.

A step in that direction. Do you think that there's a world where you get to a place where you're kind of It's weird because we're not hammering down the nails of the spikes, we're adding spikes, but um this weird metaphor that we're stretching a little bit too far, but uh is there a world where um where where you can be doing post- training or just adding capabilities in a more iterative uh cadence so that as soon as you identify something, the the response can be yeah, we don't need to wait until GPT6 to fix this. we can just add this capability because hey, we just found a pocket of users who are who are trying to do a thing and they're not super happy with the results and let's add this capability.

Yeah, I I think so. Um I mean I think we you know we are going to launch other models between now and GBT6. I think it's relatively common knowledge but we do update the model in chatbt reasonably often.

Yeah, people talk about it all the time.

Yeah, exactly. And you know I think we are now in a world where we can conceivably update that model and have it get materially better on capabilities too. Yeah.

Not just on, you know, the personality is a little bit better than it was before.

Yeah.

Going back to your note on the new uh paper that I guess you guys are releasing today when you talk about optimizing for helpfulness. Is there is part of that avoiding the model? you know, reinforcing. There's times when you want to reinforce and and give kind of like confidence to the user that they're going down the right sort of like thought process and things like that, but then there's like a point where it can get too extreme in terms of maybe convincing a user of something that may be totally untrue. Is that is that what the paper gets at or or am I reading?

We So, it's not specifically about this. Although I will say we we do explicitly train the model to not lead users down bad paths. That's something that I think we've started taking much more seriously over the last few months as we've realized Sam talked about this a little bit um I think back in May, but CHBT is just way more important for people's lives now than it was a year ago or especially two years ago. And we do have to actually be very cognizant of what effects our models have on users. So yeah, we we do we do very actively train models to not lead users down the right path. Don't fact check me on the releasing today. I know we're releasing it. I believe it is soon.

I think it's today, but I've also been in a hole dealing with launch all day.

Yeah, we're not big on fact checks here. We're big on the truth zone, which is just the vibes. And

the vibes are we'll be publishing some information about the new safety setup

at some point. That's great. Um

yeah, I think I think a large part of the conversation around safety should be how reliant and and how useful the product has become to users and then the new level of care that you have to provide versus a while ago when it was just like people saying making a cute image or generating some text that they were going to use in an email or an internal document and and realizing this this vector of usage which is this like companion confidant that uh is is becoming so prevalent.

Talk to me about post training for uh big big partners, enterprises, government organizations. Um what is transferring from the research that you're doing to something that can be uh offered as a as an enterprise level product?

Yeah. So we do openAI does partner with external companies to do essentially custom post training. Um that that is a thing that we do and from that perspective the stuff we do just directly transfers. I'll also say that we we've

put a lot of work into trying to make our models as general as possible.

But to as large an extent as possible if you want to get really good results from our model you can do it right on the API just by actually telling the model what you want it to do.

Yeah.

Right. Like GBD5 I think is pretty comfortably our most durable model ever. Y

we've heard a lot of really positive feedback about this, especially from like folks like Curser.

Yeah. So if I came to you and I was like, I'm an enterprise and I need to generate a lot of Studio Giblies, you'd be like, what are you doing? Just prompt it. Um but what are the examples of of of companies and organizations? Is it just is it just private information, private data sets that aren't available on uh on the open web or is it specifically like there is enough data out there but there's just not the economic incentive for your team to go and RL on you know Gas Station bench or whatever we're talking about here hypothetically.

I think the answer is both.

Yeah, it's definitely both. Um because yeah, we're not going to target, you know, as you said, gas station bench

because what people are doing with not not on our own right now probably because it's not mostly what people are doing with tragic.

Exactly.

You have some application that's super valuable to you.

Yeah. Yeah.

We can be convinced that it's important.

Yeah. Yeah. Yeah.

It's just not what our users are already trying to do.

What's the state of reward hacking and and and fighting that in in RL environments?

Um you know, I think we've actually made a lot of progress. There was some discussion of this around 03 that 03 was like a little bit deceptive in ways that felt reward hacky and

GBD5 is dramatically less deceptive than 03 was.

What's an example of how that would manifest? Like do you have like a canonical uh case study?

Yeah. I mean the canonical thing is like you ask 03 to write you some code and instead of actually writing some code it changes some unit test

changes the test case, right? Which is which is kind of hilarious. It's like one of the funniest things that an AI has ever done. And I understand that it's very bad and it's not what we want, but it is just like it's kind of cheeky in my mind.

It's kind of cheeky. It's also like, you know, I've I feel like if you spend enough time around real software engineers, they do actually do stuff like this pretty often. So

I have 100% done that.

I I was going to say I also have done that. Uh

for for formal reasons, I won't say that I did it at OpenA, but back when I Well, I definitely did that.

Yeah, of course. Of course. This is natural.

What do you think GPT6 looks like? You guys, you mentioned that you're going to be shipping, you know, updates to five,

← Back to story