Medal raises $133.7M to build general AI agents using gaming data for spatial-temporal reasoning

Oct 16, 2025 · Full transcript · This transcript is auto-generated and may contain errors.

Featuring Pim de Witte

Great to see you. We'll talk to you soon. Cheers. Have a good one. That was one last general intuition. We've got to hop on with London in just a few minutes, but Pim is in the Restream waiting room. And now he's in the TBPN UltraDome. Welcome to the show. Sorry for keeping you waiting.

Give us the news. Yeah, we've raised $133.7 million, because we're gamers. Point seven. Yeah. To build general agents for environments that require deep spatial reasoning. 1337, leet. This is a gamer reference. Tell me about your... Didn't you make money as a gamer at like 18 or something like that?

So I grew up with Tourette's, so I didn't really do much other than playing video games as a kid. And I built the largest private server on RuneScape when I was a teenager. Did about a million and a half in revenue by the time I was 18 years old. Yeah, it was pretty funny. That's incredible.

I mean, people come by with, you know, $100 million fundraises all the time, but that is extremely... we don't see a $133.7 million seed. That is special. So thank you so much for taking the time. Let's actually dive into the company. Tell me more about what you're building.

Yeah, so we're building general agents for environments that require deep spatial-temporal reasoning. What this means is, look at drones, for instance, or robotic arms: they all ship with game controllers already. So there's this general interface for applications that's already in all the robotics, where we don't need to reinvent the wheel.

And so the bet that we're taking is that we can train these foundation models on so much diversity represented in all the gaming data, which increasingly looks more realistic as physics engines get better, and that we can transfer to novel environments with very, very minimal new data.

And we're seeing the scaling for this happening now. It's very, very clear we can do this. One application that we're going after... so after doing RuneScape I worked at Doctors Without Borders.

I spent three years there, and so one of the applications that we're excited about is search and rescue drones, which is basically decoupling the need for humans to even be observing. You can basically cover a lot more space a lot more quickly if you have general agents that can navigate these types of environments.

So the future of search and rescue, in your vision, is like drone swarms? Let's say somebody's on a hike, they get lost, you'd be able to launch a swarm and cover, you know, 100 square miles in an hour.

Yeah, I suspect there will still be VLMs on the other side initially that sort of analyze footage and process information. But yeah, I think the bottleneck is just making these things not stupidly run into things, to be honest.

So that's what we're focused on right now: just general navigation for novel environments in devices that have gaming inputs represented specifically. And that's the key, right?

Because you have these inputs that gamers already use to control things in all these very diverse environments inside video games, including things like drones, right? Then the agent only has to adapt to an environment, not a new action space. And so the bet is that that transfers.

What are some other specific categories that are exciting to you, or types of companies? Yeah, so look at it this way. The model sits in a general action space, meaning that it has an understanding of how different actions relate to the world that it's acting in.

You have to make sure that when you transfer it over to the physical world, you've solved for safety, right? And so what we generally do is we deploy agents, for instance, in video games that have a larger action space, that can do loads of things, because you don't need to solve for safety as much there.

And then when you can verifiably prove that you deeply understand a specific action space, such as navigation, then you start transferring those actions over to the physical world.

And we have sort of an internal joke that we think simulation might actually be larger than the physical world, because there's only one physical reality and there are many simulated ones. I like that. And so, yeah.

So our take is... we're already working with game developers to get these models deployed into their games, which means much more fun playing video games. And it's also a great flywheel, right?

Because if we can verifiably prove that these models are performing great against game players, then, you know, that's a great benchmark to also start transferring out to other environments.

Walk through the usefulness of something like Unreal Engine versus, I'm going to mispronounce it, Gaussian splatting, and then a Genie 3 style generative world model where you're generating the frames on the fly.

We've heard about a lot of companies using all three of those, mixing them together, or focusing on one or the other. Are you particularly bullish or bearish on any individual one of those technologies? I've thought about this question a lot.

I went through the depths of trying to actually start building my own physics engines to understand this from first principles. So I'm going to try. I might take a little bit. Let's give it up for first principles. Yes. Love it.

So the most computationally difficult things to simulate are high-degree-of-freedom agents, and that grows exponentially as you have more agents in an environment.

Meaning that at some point in simulation you just hit a point where you have to bet on video transfer, right? And at the same time, there are loads of things that we currently do not have video footage of, which means you cannot really bet on video transfer for those things. Like cells and, you know, smaller things that aren't well represented. So you're always going to have to use a combination of the two. The other thing is that when you sit inside generative world models like Genie, for instance, you're not fully in a verifiable domain, right?

And so the problem with... to be clear, this is an incredible breakthrough, right? But there's very much a use for engines like Unreal. And I think where we end up is it's actually really annoying not to have that determinism in these world models, right?

Because you want to be in some form of verifiable domain. So I hope that somebody manages to create some form of hybrid architecture. Like, okay, for instance, with GPT, you can hallucinate all the text, but then you can RL it and then you have, "Oh, write some Python."

So if you need to do a complex math equation, you write Python, right? Yeah. Have you guys ever watched redstone CPU videos on Minecraft? So there are videos on Minecraft where people actually build redstone CPUs, and now they also build the types of code.

So my point is, you know, we might actually end up finding something really interesting. My dream is to be able to simulate a CPU inside a world model, where you sort of have some form of determinism. And again, this is not at all possible or near possible today, but my point being, maybe a world model can... Yeah.

And so, yeah, I think you need both right now. You want to bet maximally on world models and video transfer for things that are hard to simulate, and then maximally on simulation for the things that you want to stay in a verifiable domain on, which is most things, in my opinion.

That makes a ton of sense. Thank you so much for coming on the show. Congratulations. Just a very fun conversation. We'd love to have you back on. Anything happens in the news that you're excited about, just ping us. We'll have you on. I'm sure. We'll talk to you soon.

Have a good rest of your... We'll talk to you later. Before we hop off, let me tell you about Wander. Find your happy place. Find your happy place. Sorry to leave you hanging there for a second. Oh, you're good. Oh, no, I'm fine. I didn't even notice.

Book a Wander with inspiring views, hotel-grade amenities, dreamy beds, top-tier cleaning, 24/7 concierge service. It's a vacation home but better, folks. And good luck to Regav in the chat. He is going to a job interview and we wish him the best.

Leave us five stars on Apple Podcasts and Spotify and we will see you tomorrow. Thank you for tuning in. Can't wait. Goodbye. Have a great evening.