Generalist AI's Pete Florence on building foundation models for robotic dexterity — and why humanoids aren't the whole story

Jun 8, 2026 · Full transcript · This transcript is auto-generated and may contain errors.

Speaker 2: I thought they underdelivered, but now they're back, so I'm voting Apple. Yeah. The most biased jury in the world. Imagine. Well, we have Pete Florence from Generalist. He's the co-founder and CEO. Pete, how are you doing?

Speaker 6: Hey, John. I'm doing great. Thanks for having me on.

Speaker 2: Thanks so much for hopping on. First time on the show. Introduce yourself. Tell us about the company.

Speaker 6: Awesome. So really, the backstory of the company goes back to the story of the founders. For myself, I've been working on robotics and AI for over a decade now. I started my PhD at MIT in 2014. And it was in grad school at MIT that I met one of my co-founders, Andy Barry. Just an amazing overall roboticist. He was at Boston Dynamics for five and a half years. Then after grad school, I was at Google DeepMind. And there I worked super closely with another Andy, Andy Zeng. We published a ton of research papers together and, you know, eventually there was this overwhelming sensation that to really match the shape of this challenge, to just go build general intelligence for the physical world, we needed to get the dream team together, get all the right folks, and have a really focused plan on how we actually build and scale this whole thing. And that's what we've been doing for the last couple of years.

Speaker 2: You haven't said the word humanoid yet. What's going on? Is there more to this than...

Speaker 6: You know, we think humanoids are awesome. A lot of people on the team have been working on humanoids for over a decade. Like I mentioned, when I started grad school, that was the era of the DARPA Grand Challenge, you know, one of the first sort of big humanoid things.

Speaker 2: This is Roko's Basilisk. He doesn't want to say anything bad about humanoids. Yeah. Yeah. He's like, they're great. They're great. I'm not really working on them right now, but they're great. Don't want to say anything bad.

Speaker 6: There's those two kids

Speaker 2: because when they rise up, you don't want to be on the record saying

Speaker 1: You saw those two kids in China Yes. That got kicked

Speaker 2: Yes.

Speaker 1: By humanoids. You know what those kids were doing right before the performance?

Speaker 2: Talking trash.

Speaker 1: They were talking some smack.

Speaker 2: Probably talking smack, saying that there's a different path to our robotic future. It's fusion. But take us through it. Like, what is the future? How do you see this all rolling out? Because I think everyone agrees with you.

Speaker 6: I don't know if everyone agrees. I mean, otherwise people wouldn't be spending billions of dollars building only humanoids. But I do think that humanoids are, of course, a form factor that makes a lot of sense for a lot of things. We just think the future is much bigger than only humanoids. You know, there'll be billions of robots, and some of them will be humanoids, but there'll be a lot of other form factors too.

Speaker 2: Yeah. Humanoids are the cars and you're building the motorcycle. I like it.

Speaker 6: I would say, in that analogy, we're more like building fundamental engine technology that could be used in motorcycles or used in cars or could be used in planes or, you know, whatever you...

Speaker 2: So, I mean, that feels like it gives you potentially a larger TAM in the near term. Because as much as everyone says, oh, it's humanoids, what is the ramp for humanoid robots to actually get into the home? There are so many edge cases. In the factory, you have polished floors. Why don't you just have wheels? There are all these reasons why humanoids, and their supply chain, could take five years. It could take ten years. But at the same time, Amazon actively has a million robots. They have their Kiva systems. They're rolling around. They're nowhere near humanoid, but they're incredibly economically valuable. And if you could be a part of that supply chain, you probably have a great business. But what supply chains that we would broadly define as robotic do you see as being the most near-term consequential to your business?

Speaker 6: Yeah. Great question. So, you know, overall, robotics and automation have been deployed in a ton of different industries doing a lot of different things. And this has become kind of obvious to me over time, but we don't want to be doing anything that previous robotics and automation has already solved. We really want to be focused on the things that haven't been solved so far. So in a lot of cases, these are industries that already have a lot of robots, but there are a lot of different types of applications in them that just have not been possible to address with robots before. So things like logistics and manufacturing, the supply chains for those, or, you know, really just the rollout of operations of those at scale. Those are two really obvious areas. But then, you know, the name of the company is Generalist, and it's very much about having these models power robots in a ton of different industries, a ton of different applications. That's really what we're doing.

Speaker 2: Okay. So I have a calculator app. It's written in Python on the back end. It works fine. I could replace that with an LLM and just ask the LLM to guess the answer to every math question I give it. That would be inefficient and not really an upgrade. It's possible. My question is, let's say I have a factory and I have a bunch of CNC machines with robotic arms that are on deterministic paths. I have some KUKA robotic arms that are all preprogrammed. They take the glass and they put it on the windshield of the car as it rolls along. I have a bunch of robots that are programmed like that and they're doing their job. They break down. There's all the usual things. But why do I want to go from deterministic control and operation of my robotic fleet to something that's stochastic and AI-driven?

Speaker 6: Yeah. It's a great question. So, John, yes, those types of applications that you mentioned are already solved by robots today. Like, we already have welding robots.

Speaker 3: Yeah. Right?

Speaker 6: Yeah. But there's a lot of stuff that, even in very structured, roboticized environments, is still very challenging for previous-generation programmed robots to solve. The easiest example in auto manufacturing is wire harnessing. The reality is that things like wires, or lots of different finicky objects and applications that are easy for people to deal with, are just out of scope for the traditional approach where you program the robot to do those things. But that's just auto. There's really a variety of different industries that are set to benefit from these models that can pick up anything you want them to do. And especially what we focus on is dexterity, which we really think is the holy grail. I think this is not even that hot of a take within robotics. Everybody in robotics knows that dexterity has really been the bottleneck, and that's the one we are really focused on pushing: making it so robots can use their hands to do, you know, a massive variety of different applications.

Speaker 2: So is the biggest opportunity taking something that doesn't have a robot in the loop at all and creating a new robot or a system that can, you know, use existing robots to do that task? Or is there still opportunity in, look, I have a KUKA robot arm and 80% of the time it does it right, but we have someone there who is taking over and getting on the Xbox controller or something once or twice a day, and that's where we want to bring AI to bear to deal with those edge cases? Or is it more the wire harnessing, where you're not doing that with a KUKA robot arm at all, and so we're going to unlock that for the first time?

Speaker 6: It's much more of the latter. Much, much more of the unlock, right? And I think even from what we've seen so far, just starting to announce these models and starting to see people coming to us with what they want to use them for, there's this general sensation of an explosion of interest. Like, oh, I've never thought about using a robot at all for this entire application, and now, okay, I see. We can start to get these robots to work very quickly and very reliably and with good speed and all these other things that are needed to really deploy them. This is coming online. And, yeah, it's really much more of this unlock for all these different things that we haven't really been able to use robots for before.

Speaker 1: It feels like the way that robotics are naturally diffusing, which is, you know, kind of behind the scenes.

Speaker 2: Sorry. Watching that. I didn't see the video of this. This robot slapping at kids so bad.

Speaker 1: So bad. We'll play them after you jump off.

Speaker 2: Can stay on.

Speaker 1: So the way that robotics is diffusing is naturally in industrial settings, behind the scenes. And it feels like there could be massive amounts of progress being made and relatively little hype around it, because the people that are experiencing these products in real time aren't necessarily on X just being like, this model changes everything. I no longer write code. I just prompt, etcetera, etcetera. Does that feel at all accurate?

Speaker 6: I think, you know, what I think about there is that the industrial applications are likely to be the ones that really start to ramp even more quickly than consumer or home-type applications. Of course, there'll be more and more robots in people's homes, but in terms of these things really starting to scale, we do think that industrial is more likely to take off. But we're also super excited about everything that will be happening in consumer and home-type applications. So that's one of the fundamental strategies of focusing on the model intelligence, which can be used in all these different cases: we do think industrial is more likely to take off soon, but we're super excited about supporting consumer and home-type things as well.

Speaker 2: Data, benchmarks, evals, what's the current thinking? The sim-to-real gap, puppeteering, I mean, there are so many different pieces, but I want to get up to speed on your philosophy on each of these trade-offs. Like how much do you believe that, you know, the scaling laws apply versus just, you know, we've seen all these crazy companies doing, like, you'll wear a camera to collect training data for humanoid robots like a what's company the data not collection training like free

Speaker 1: house cleaning.

Speaker 2: Yeah. Yeah. What's your data collection thesis? What's your eval benchmarking thesis? Like how do you know you're getting better? How much of it is qualitative versus quantitative? Like take me through that side of the business.

Speaker 6: Totally. These are all, you know, very very good and very core questions.

Speaker 2: Yeah. I know, it's like five questions that we could spend an hour on each of. Tell me everything.

Speaker 6: Exactly. You know, what I would point to here is that we've started to share a fair bit on what we think is really core and what drives all the decisions that we put into our model. Going back to scaling laws, which you mentioned: we announced our GEN-0 model back in November.

Speaker 8: Yeah.

Speaker 6: And I think it really was the first time in robotics that anybody had shown general scaling laws, right, where we can predictably advance performance with more and more compute and data. And of course, for all the AI researchers who have known what has been happening in all the other domains, this is something that we were expecting to happen. And GEN-0, back in November, really was the first time that anybody had shown this. Fast forward just five months, to April, pretty recently, and we announced the GEN-1 model. And it's quite a bit better of a model. And the biggest point... nice. We've shown some videos here, which I can give some color on. It's really quite a

Speaker 2: have a whiteboard and this is a job in our studio. So Tyler, I'm sorry.

Speaker 1: It needs to be erased.

Speaker 2: It actually needs to be erased. Tyler's...

Speaker 1: Everything on the board is irrelevant now.

Speaker 6: See if we can get a robot to the studio before too long. But you know, the GEN-1 model really is starting to cross into these levels of performance that we think work for certain types of applications. And we try, you know, to always under-promise and over-deliver, so there's plenty more that we still have to go. We feel very early overall in the general journey of general intelligence for the physical world, but we're starting to cross over into levels of performance where these things are commercially viable for a good number of applications. And we think that this is really a crossover point, where we have a general model starting to be able to hit levels of reliability and speed and improvisational intelligence where we can start to get these things out there. It's very much like how you take a GPT-2-level model, you scale it to a GPT-3-level model, and you start to tick over into certain types of applications becoming commercially viable. Right? If you remember, GPT-3 started with Copy.ai and Jasper.ai, you know, copywriting for ads. That was kind of the first thing to take off, and a few others. And we feel like that's starting to be where we're at with these models for the physical world.

Speaker 1: Okay. Question. Is it possible that for certain robotic form factors, let's say humanoids, they get to the point where they can do economically valuable work, as in the robot can make something, but the process overall is not commercially viable because of the CapEx needed to...

Speaker 6: Yeah.

Speaker 1: To...

Speaker 2: You're talking about purchasing the...

Speaker 1: So like, I've been looking at humanoids do some work, and they're able to do some type of process, let's say. And I can say, okay, maybe right now there's a bunch of humans out there that do that kind of work and you can put a humanoid there. The problem is a humanoid has all of these different, you know, actuators and motors and batteries and all these different things. And I'm looking at it and thinking, okay, is it possible that the robot just starts breaking down? Let's say you spend $50,000 on a robot and it replaces a human, but it starts degrading over time, and many of the parts need to be replaced frequently enough that you're effectively having to buy replacement parts or buy a new robot so frequently that you're better off just having a human in that role.

Speaker 2: Well, I actually hired a Quinn Emanuel lawyer to do my dishes. And so, like, even if it's $600 an hour, I'm gonna be saving money.

Speaker 6: Nice. Yeah. Whether it's the things breaking down or whatever it is, any factor that makes it so that what you need to get done is not getting done, and you need to have somebody looking over their shoulder the whole time, that makes it not really at the level of viability that I think is needed for these things to really scale and be useful.

Speaker 2: But We're running long. Do you have another minute?

Speaker 6: Yeah. I'm good. Perfect.

Speaker 2: Tier list of data sources. Do you know tier lists? Yeah. S tier is the best, then A, B, C...

Speaker 6: I love tier lists.

Speaker 2: I want to throw some data sources at you and you can sort of rank them. So, first one would be YouTube videos, or internet video. Like transfer learning from, there's a video of somebody skateboarding or walking around. Is that a valuable data source? Put aside the cost to license it. Just think about the quality.

Speaker 6: Yeah. This is great. I don't know the whole set yet, so I have to...

Speaker 2: Yeah. Yeah. It's tough. This is also live, so it's tough to...

Speaker 6: in the whole

Speaker 2: It is live.

Speaker 6: Good. No, no. I'll give you an answer. It's not S tier.

Speaker 2: Okay.

Speaker 6: It's not A tier. Okay. Maybe I give it a B.

Speaker 2: B. Okay. Next one.

Speaker 6: I don't know what else is coming.

Speaker 2: World models. World models. I've heard a lot of hype about transfer learning from world models. Seems a little bit early but where is that?

Speaker 6: So world models as a, right, your question upfront was as a data source, right?

Speaker 2: Data source specifically. So you can generate infinite worlds. You can generate a world of that whiteboard and use that as training data even though it's synthetic.

Speaker 6: Yeah. I think synthetic data in general in robotics is still a very open frontier, and there's not a huge amount of proof points here in terms of it really enabling things. It's an area I think is promising, but there haven't been a ton of proof points. So I would just say C tier. I would put world models, like, not as a source of data.

Speaker 2: F tier. More like... Whoa. F tier. I'm here. I'm gonna, wait, is Radical Ventures in any world model companies? I'm gonna give Rob Toews a call and tell him that you're talking trash. And Roko's Basilisk is not gonna like this when Fei-Fei Li achieves ASI and has a bone to pick with you.

Speaker 6: It's it's a it's a complicated question. I I think that

Speaker 2: It's okay. We can move on.

Speaker 6: The data that is behind these world models. More like the source of data I would say.

Speaker 2: Okay. Third data source, mocap data. You got a bunch of people in mocap suits. You got them doing things. It's a three d representation, maybe a point cloud, something like that.

Speaker 6: This is a list of questions. So for me it's not a data source I'm super excited about

Speaker 2: Another F tier?

Speaker 6: No. Well, let me explain. We really care especially about dexterity. Right? And mocap suits of people running around a studio is not the main place where you get dexterity. It is good for, you know, full-body, whole-body motions.

Speaker 1: Well, that's your your first issue

Speaker 6: is your carry on.

Speaker 7: In my

Speaker 6: own list, I'd put it C tier.

Speaker 2: C tier. Next. I get a lot of Instagram reels from this community. It's the gloving community. I don't know if you've heard of the glovers. So they put LEDs on the ends of their fingers. They wear gloves and they do light shows for each other. It's like a Burning Man, Coachella type thing. Anyway, they exemplify remarkable dexterity. Is that going to be an important source of data?

Speaker 6: We love remarkable dexterity data. I don't know how much of this source of data... It depends on what type of data. Not just what people are doing, but what type of data you can extract when people are doing it. So I don't know what sensors the gloves have, but in general, the more dexterous the data, the more valuable.

Speaker 6: So without knowing more about it, maybe I'll put it B or A.

Speaker 1: John will start a gloving data labeling campaign.

Speaker 2: Next one, just the general, like, open-crawl internet data. I talked to a very thought-provoking AI thinker at one point, and he said that with enough scale you could learn to walk as a humanoid robot just from reading...

Speaker 1: By the way, a pretty heavy hitter robotics founder texted me live and says, Pete is trying to figure out how to not say world models suck. And then he says, there it is.

Speaker 2: What is like the

Speaker 7: world model?

Speaker 6: We wrote a blog post on this.

Speaker 7: We think they

Speaker 2: have back real in five years and it'll be great. But specifically, just this idea that to start your foundation model, to do the earliest pre-training, it might be helpful to just start with a general understanding of the world, so you bake in all the Reddit data or all the internet Common Crawl, all of that stuff. Where does that rank on the tier list?

Speaker 6: So here, let me say on this, right? Exactly this idea of, oh, let's use all the data from the internet we possibly can and bake that into the robot brain, and then that's a source of knowledge that the robot brain has, and then we also have the robot learn all the other things. That was really the core of my work back when I was at Google DeepMind. And, you know, a lot of the research in there was things that myself and my co-founder Andy Zeng did. That's what we did.

Speaker 2: I'm here next I'm very passionate

Speaker 6: about it.

Speaker 2: Well, talk me off S tier.

Speaker 6: Not S tier.

Speaker 2: Oh, okay.

Speaker 6: It's great. It's very helpful. It's something you definitely want. Right? But I would just think about it this way. Like, what's the best way to learn how to ski? Is it to read a book on skiing?

Speaker 1: No. It's to read r/skiing.

Speaker 6: Oh my god. It's to read r/skiing. That's right.

Speaker 1: No. No.

Speaker 2: It's to watch a bunch of Instagram hype reels added into SDKid. That's what you've got to watch. You've got to get the video training data, not the r/skiing. I think I had another one. Okay. The last...

Speaker 6: The point is you want to go skiing, right?

Speaker 2: Okay. That's the point. The last one is simulation. I give you, let's call it, Unreal Engine, and I have a one-to-one representation of a particular robot, and I can vary the terrain and I can vary the motions, and it can sort of goal-seek over an inverse kinematics model, and you can use that to train off of. Where does that fit in?

Speaker 6: I'd say C.

Speaker 2: C tier. Okay. Is there anything that's S tier or A tier? Like, am I missing something, or is this the secret sauce? Is this why I've got to give you $400,000,000?

Speaker 6: I mean, you can take a look at some of the data that we have. We've shared a little bit about what we... You know, I certainly think that our data is very, very good. But the thing that creates S tier data is not just, oh, what's the overall data capture methodology. It's also about what people are actually doing. What's the quality of the data? That's the type of thing that's very hard to put your finger on in a short conversation. But as you live and experience this stuff all the time, you develop this appreciation for what really drives quality. And this is one of the things that really makes the best language models the best language models. Quality of data is an incredibly important part of S tier. Lived experience of the physical world is...

Speaker 1: Wearing a...

Speaker 6: Lived experience of the physical world is S tier. That is exactly what it is.

Speaker 2: I love it. That's a great that's a great philosophy. Well, we gotta hit the gong. $400,000,000, $2,000,000,000 valuation. Thank you so much for coming on the show.

Speaker 1: Great stuff. Hang on.

Speaker 2: This is great stuff.

Speaker 1: Let's do it again soon.

Speaker 2: We'll talk soon.

Speaker 1: Appreciate the live, live, like, the proof of work background too. Robots have been back cooking.

Speaker 2: They're cooking.

Speaker 1: They're cooking.

Speaker 6: It's just every day here. Well, thanks for having me on, you guys.

Speaker 2: Thanks so much. Talk to you soon. Cheers. Goodbye. Let me tell you about Railway.
