Runway's Gen-4.5 tops the video arena leaderboard, beating Google and OpenAI with a research-first, taste-driven approach

Dec 1, 2025 · Full transcript · This transcript is auto-generated and may contain errors.

Featuring Cristóbal Valenzuela

Before we bring in our next guest, let me tell you about Figma. Think bigger, build faster. Figma helps design and development teams build great products together. We have Cristóbal Valenzuela from Runway in the Restream waiting room. Let's bring him in. How are you doing? Good to see you again. Thank you so much for taking the time to come talk to us on such a big day. Kick us off with an intro, a reintroduction on where the company is today, and then the news. I'd love to know about the news.

Yeah, thank you for having me again. It's been a while. So, big news: we just released our latest frontier model, Runway Gen-4.5.

It's a model we've been working on for quite some time. It's the best video model in the world right now, which is a pretty remarkable feat.

Yes.

So, I think it's pretty good. It's pretty fun to play with.

That's my audio I'm adding, not that. But perfect timing.

But let's play some of the video. I want to see the demo videos that you put out, the examples, and I want to ask you a bunch of questions about it, because it's an extraordinary claim. Google is a serious company. They have a very serious asset in YouTube, and I'm fascinated. So first, give me the news. The video arena leaderboard, that's the ranking you're using. How is it scored? How does that actually work?

So it's kind of a way of crowdsourcing performance. You basically ask people on the internet to vote between two videos, and it's anonymous. So you vote left or right, and as you keep voting, the models accumulate more votes. Once you vote you can see which model you voted for, but beforehand you don't know. And so over the last couple of months we've been working on an entirely new way of training both video models and image models, in such a way that we hoped it would out-compete the others in the arena. We got results a couple of days ago, and yes, we managed to out-compete all other video models, including both Google's and OpenAI's, which is a very remarkable feat if you think about the scale of resources. Ilya was saying this is the era of research again, and I agree. But it's also the year of efficiency: really good, really focused teams with highly efficient mandates can get really far.

And so, yeah.
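That crowdsourced left-or-right voting is the standard pairwise-comparison arena setup, typically scored with an Elo-style rating. The interview doesn't specify the leaderboard's exact formula, so the sketch below is a minimal illustration assuming standard Elo parameters (base rating 1000, K-factor 32); the model names are illustrative, not actual arena entries.

```python
# Minimal sketch of Elo-style scoring for anonymous left/right video votes.
# Assumptions (not specified in the interview): base rating 1000, K-factor 32,
# and illustrative model names.

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def record_vote(ratings: dict[str, float], winner: str, loser: str, k: float = 32.0) -> None:
    """Apply one pairwise vote: the winner gains rating, the loser loses it."""
    ra = ratings.setdefault(winner, 1000.0)
    rb = ratings.setdefault(loser, 1000.0)
    gain = k * (1.0 - expected_score(ra, rb))  # bigger gain for an upset win
    ratings[winner] = ra + gain
    ratings[loser] = rb - gain

# A few illustrative votes; real arenas aggregate thousands of them.
ratings: dict[str, float] = {}
for winner, loser in [("gen-4.5", "veo-3"), ("gen-4.5", "sora-2"), ("veo-3", "sora-2")]:
    record_vote(ratings, winner, loser)

print(sorted(ratings.items(), key=lambda kv: -kv[1]))  # leaderboard order
```

Because the expected score shrinks the reward for beating a weaker opponent, the ranking converges on relative strength rather than raw vote counts.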

Yeah. Tell me about what you optimized for here, because Sora is an incredible model, and for a minute it was really mind-blowing. Then I feel like I developed an immune system for it, and now I can clock a Sora video. It feels like Sora was very much trained on TikTok, on vertical social media video. And what have been the breakout Sora videos? It's been a lot of dash cam footage, doorbell and Nest camera footage, and face videos, and they've degraded the model a lot. Whereas Veo 3 felt like it had a little bit of the Hollywood polish, but it was more like Michael Bay when I looked at it: very saturated. It was cool, it looked good, but what you went for feels a little more, I want to say, cinematic, even though that's kind of an overused term. Talk to me about what your goal was, or whether you even have a goal when you go into a training run like this.

It does. So I think there's an explicit goal and an implicit goal. In a way, all models, specifically video models that are more visually perceptible, have some sort of personality behind them. And I think that personality reflects both the point of view of the company and the way you want to train the models in the first place. To your point, if you want to make consumer slop, quick shareable stuff, you're going to train the models from the ground up very differently than for what we're trying to do, which is a much more professional, high-quality, very controllable set of tools. So a lot of what you're outlining is, I would say, the personality of the models, which somehow also reflects the personality of the companies: if you're trying to sell ads, you're going to build a very different model than if you're trying to build creative tools. And so I don't think there's one single recipe or one single ingredient. It's more just taste. I think that word gets thrown around a lot in research today: taste.

And I think taste is both in the research, what you want to work on, having vision, saying, okay, I want to pick these specific problems, this is how we're going to solve them, and this is what we've learned over time. That's one form of taste. And the other, more aesthetic one, is knowing what things look good. Like that one on a construction site.

This is actually very pure taste. That's pure taste.

Look at that. Hilarious,

right? Look at the motion of the donkey, the camera movement, the angles. The amount of work on data creation our team of artists and filmmakers has put in is not trivial, to be honest. I think that's also the taste component, shots like this. It's like...

Some of this is horrifying. I mean, I guess that's the point.

You really had to summon the demon on this one. Have you been inspired by Anthropic at all? It feels like somebody could put you in the "Anthropic for video" bucket: that same extreme focus on one thing, code in their case, while ignoring everything else. And meanwhile, your competitors are putting a lot of resources toward this, but they're not betting their entire business on it the way that you are.

Yeah. I think it's a mercenaries-versus-visionaries type of bet. You want to have people who feel very committed to the vision long term, and the way you do that is by being very focused on the culture, and that culture eventually shines through in the product. I think Anthropic has that too: you can tell who works there and how they think, and it's all very cohesive in a way.

I think we spend a similar amount of time doing that, and I hope you can tell through the models themselves that that personality comes across nicely as well. And I agree: in the end, that will perhaps be the most defining trait of the companies that stay around in the long run. I think if you just throw money at the problem, you're not going to get too far, to be honest.

Yeah. What went into the actual training run? Are you at a scale now where it's a meaningful capital investment to build a model like this? We saw the scaling paradigm change from maybe $100 million for a big frontier language model run to talk of billion-dollar training runs, bigger and bigger. The results are remarkable, but has it been a remarkable amount of investment to get here, or are there more efficient ways to get to a frontier result without spending frontier money?

Yeah, I mean, it's definitely not cheap. This is not traditional SaaS, so you definitely have to spend more money, more resources. But I think we've proven that we are not spending tens of billions of dollars to get there and overcome the challenges. And look, to be honest, the model is not perfect. There are a lot of things we're going to improve and fix, and we're going to do larger training runs and do more over time. But I would say the most expensive thing is the natural intuition the team builds around what works and what doesn't. It goes back to the idea of research taste: you can't just throw money at it. You have to spend enough time. We've been working on Runway for almost a decade.

And so there's a lot you learn over time about what works and what doesn't, and that informs a lot of the efficiencies in training. And yes, models are expensive: you'll need more money to train larger models. If this is the worst the models will ever be, imagine them in two years. You're going to get there by training larger models, for sure, but also by knowing how to train them in the first place, and that's the part that I think is hard to quantify. What I'm really excited about is not only what the models can do, but also that the efficiencies aren't only in training, they're in inference. This is at a price point very comparable to our previous models, so it's actually very usable, and hopefully you'll be using it in real time very soon. That level of efficiency at the inference level, we haven't seen yet, and I think we're going to get there very soon.

Yeah. Fascinating. I mean, some of those videos are pretty remarkable. Your unlimited plan includes 2,250 credits monthly. How much video can one actually generate with that?

Well, technically unlimited.

Okay. I was confused, because it said there's still a credit system, but...

No. So we have a queue. We have compute, there's a queue, you get into the queue, and you generate as the queue becomes available. If you just want to generate fast, you pay for credits. So it depends on how anxious you are about your generations; it's a measure of how fast you want it. But eventually you can literally generate unlimited video. By the way, I think no one else has a plan like that. It's a pretty good deal.
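In other words, unlimited jobs wait for spare compute while credits buy priority. Here is a hypothetical sketch of that two-tier scheduling idea; the tier names, job names, and FIFO tie-breaking are all illustrative assumptions, not Runway's actual backend.

```python
# Hypothetical sketch of the queue-vs-credits scheme described above:
# unlimited-plan jobs wait in a shared queue, credit-paid jobs jump ahead.
# Purely illustrative; this is not Runway's actual scheduler.
import heapq
import itertools

PAID, UNLIMITED = 0, 1          # lower tier value = dispatched first
counter = itertools.count()     # FIFO tie-breaker within a tier
queue: list[tuple[int, int, str]] = []  # min-heap of (tier, seq, job)

def submit(job: str, paid: bool) -> None:
    """Enqueue a generation request; paid (credit) jobs jump the line."""
    tier = PAID if paid else UNLIMITED
    heapq.heappush(queue, (tier, next(counter), job))

def next_job() -> str:
    """Dispatch the next job when compute frees up."""
    return heapq.heappop(queue)[2]

submit("storyboard-shot-1", paid=False)   # unlimited-plan job waits its turn
submit("client-deadline", paid=True)      # credit-paid job goes first
print(next_job())   # -> client-deadline
print(next_job())   # -> storyboard-shot-1
```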

What are the lengths of generations most commonly being done today? Is that a metric that you track? Is a 20-second scene the most common today? And are you trying to get to two minutes, or two hours? How do you think about duration?

So, technically you can do arbitrary durations if you want, but the average shot duration in a short film or a movie is actually two to three seconds at most, and that's been trending down. The shot itself, the cut, is two to three seconds long on average. So when people say they want to generate a 45-minute-long thing, they don't want 45 minutes of one fixed camera.

You want scene cuts and worlds, and you want the character in a close-up, a medium shot, a long shot, and you have, you know...

That's a different problem from creating one continuous long sequence. For me, the single continuous long sequence is less interesting than the multi-shot approach, where you can create much more compelling narrative work. And I think we're not that far from that being a reality, where you can generate consistent narrative work, really good visuals, really good stories, with the level of quality of the videos we're seeing right now, all tied together in a way that just feels cohesive, you know?

Yeah.

And so that's a different problem altogether, I would say.

Yeah. There was some debate on why the Cursor for video doesn't exist yet. Do you have any thoughts there?

What's the Cursor for video?

Basically a nonlinear editor, a Premiere Pro, a DaVinci Resolve, an Adobe After Effects, but as Cursor for video: replacing the actual bones of the software that the editor, the video creator, uses. There have been a couple of apps that have spun up. The reason I was originally using Runway back in the day was for green screen, for chroma keying, basically.

It was fantastic for that, and it feels like building a canvas, building an NLE, is one potential pathway to victory. It's also very difficult, because you can't just fork VS Code; there are no leading open-source NLEs. On the flip side, if you wanted to play nice with Adobe, you could be a vendor, the way Nano Banana is now vended into Photoshop, and that could be a solution. There are a variety of ways to win. I'm interested in hearing your approach.

Yeah, that's definitely an interesting question. And by the way, shout out to you for being an OG on Runway since, what, 2019?

Yeah, something crazy. I love you.

Yeah. So, my two thoughts. First, the craft of the NLE and editing and film is an art, and there's a lot of pacing and detail that's very nuanced and specific. It's about granular decisions, and it's hard for a model or an assistant to automate that level of decision-making. That's on the purely NLE side. But at least for us, the more interesting question is: do we need an NLE in the first place? Do we actually need these primitives? Nonlinear editing is this idea that you're stacking frames of video against each other and cutting them.

Before it was with physical razors, and now we have digital razors. You're cutting things together.

My bet is that you probably won't need an NLE; that whole paradigm will feel like a fax machine in a few more years.

And so I feel that's somewhat what's happening with the Devins and the Claude Codes and the Codexes of video. I just wonder if there's going to be an intermediate step, or maybe it'll just be absorbed by the current NLEs. I mean, I'm sure that's what your customers are using, right?

Yeah, I don't know. We'll see it play out. But I'm not too fond of, you know, pushing better versions of NLEs out there. I think there's something about how you make videos and how you interact with these AI systems that naturally lends itself to different primitives. And think also about the fact that very soon you'll start to see this happen in real time.

When you make real-time narrative work, or videos, or experiences, however you want to call them, you don't need to edit things async, because you're generating on the fly and you have people interacting with it. And that's what I'm saying: it changes the nature of those things in the first place. There's a transitional period where we're seeing NLEs being augmented with AI, but I think that's transitory. I don't think it's going to play out in the long run.

Yeah. Yeah.

Has Hollywood capitulated yet? What's going on there? It's funny, I've been hearing more and more about Suno, not just from guests and friends of the show, but from random people out in the world. It sounds like every musical artist now is using it to some degree, even if they're not willing to talk about it. What's the case in traditional Hollywood and entertainment? You can't exactly hide that you're using AI video; it's basically out in the open immediately. And there's just so much negative energy that gets focused on it, specifically from people within the industry, you know?

You know, I think the negative energy is like the water problem with AI: it's kind of this unrealistic, very noisy, unrepresentative sample of what's actually happening within the industry. If you go to LA, if you speak with the agencies, the talent, the filmmakers, the studios, the production teams, they got on board with AI years ago, months ago. They're fans, they're using it, they understand it. Of course, there are pockets of people who are more advanced than others,

but I would say the public narrative hasn't yet caught up with that, mostly because some people might not want to speak about it, or because it's much more interesting to say the negative things than the positive things. I would say Hollywood has already overcome that, and they're pretty much on board. Gaming companies are now where Hollywood companies were a year and a half or two years ago, so that's an industry that's now catching up to what AI can help them with and how they can use it. So yeah, I would say some of those narratives are a bit fake, to be honest.

Yeah.

Well, thank you so much for taking the time to come on the show on a busy day. We appreciate it, and I can't wait to play around with the new model. We have a benchmark here, BezelBench, where we try to recreate a very complicated shot that we filmed practically, with a bunch of different watches, with our intern, or gap-semester, Tyler Cosgrove. The shot's very long: it pulls out, it twists around. It's a pretty complex shot. That's our current benchmark; we'll be testing, and we'll let everyone know how it goes. But thank you so much for taking the time to come chat with us.

We'll talk to you soon with an update. Goodbye.

So, let me tell you about Julius.ai, the AI data analyst that works for you.