Prime Intellect's Intellect-3 proves a 100B-parameter open RL model can match closed-source giants — and every app will need to do this
Dec 1, 2025 · Full transcript · This transcript is auto-generated and may contain errors.
Featuring Vincent Weisser
Join millions who use Julius to connect their data, ask questions, and get insights in seconds. We have Vincent from Prime Intellect in the Restream waiting room. How are you doing? Great to see you. It's been too long since last weekend.
Thanks for having me.
Congratulations. You're a master of finding the one day that we're not live to launch your news. Tell us what happened on Wednesday, the one day that we were off of streaming.
Yes, excited to give you a rundown. For broader context, Prime's broader goal is really creating open frontier models and the infrastructure for everyone to create them. Last week we released Intellect-3, which is really a scale-up of RL and post-training to create a state-of-the-art model, especially for more agentic tasks. Basically, we took GLM and ran a whole SFT stage and RL stage to create a state-of-the-art 100-billion-parameter MoE model. That whole infrastructure is quite a challenge, from the RL environments to the code sandboxes and the whole stack needed to do post-training.
That's basically what we built over the last half year; I think Will Brown came on the show to unpack some of it on the verifiers and environments side. So that's what we released last week, and we really showed we can get performance at the 100-billion-parameter scale that, thus far in open source, only 300- to 600-billion-parameter models like DeepSeek achieved, so better performance at a much smaller scale. In general it showcases that open models are starting to catch up. What's quite interesting is the broader trend, not just with our model but also with other releases like DeepSeek's today and over the weekend, that they're now on par with the closed models. This was almost a preview release: we released an early checkpoint, and we're scaling it much further, especially on agentic capabilities, to make it strong across a range of tasks. The foundation of this, which is quite interesting, is that we created the Environments Hub, where anyone in the world can create one of these RL environments, which we then included in the training run. So different people in the open-source community actually contributed the RL environments that we trained this model on.
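To make the idea of a contributed RL environment concrete, here is a minimal sketch of what a verifiable, single-turn environment can look like. The names (SimpleMathEnv, sample_task, reward) are hypothetical illustrations, not the actual Environments Hub or verifiers API; the point is just that an environment supplies tasks plus a programmatic reward, so a trainer can score model rollouts without human labels.

```python
# Minimal sketch of a verifiable RL environment (hypothetical names).
import random
import re


class SimpleMathEnv:
    """Toy single-turn environment: the model must answer an arithmetic prompt."""

    def sample_task(self) -> dict:
        a, b = random.randint(2, 99), random.randint(2, 99)
        return {"prompt": f"What is {a} + {b}? Answer with just the number.",
                "answer": a + b}

    def reward(self, task: dict, completion: str) -> float:
        # Verifiable reward: 1.0 if the model's final number matches, else 0.0.
        match = re.search(r"-?\d+", completion)
        return 1.0 if match and int(match.group()) == task["answer"] else 0.0


if __name__ == "__main__":
    env = SimpleMathEnv()
    task = env.sample_task()
    completion = str(task["answer"])  # in a real run this comes from the policy model
    print(task["prompt"], "->", env.reward(task, completion))  # 1.0
```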
So give me a concrete example of this shift, of businesses that need to buy a model that has been trained in a specific RL environment. We've heard the example of someone creating a clone of DoorDash and figuring out how to do DoorDash orders agentically. What else are you seeing? What are some other good examples of when a business would pull this off the shelf, from all the different options and APIs out there, and create something semi-custom for a specific business use case? What are you seeing out there?
Yeah. I think there are basically two buckets. There are a bunch of people creating RL environments for the labs, like the DoorDash clones, basically to push capabilities. We're in a paradigm right now where scaling RL is the main way these models improve; we've seen it with Opus, with GPT-5, and with Gemini, where the gains mainly came from a scale-up in RL.
But what we're seeing is two things. On one side, there's a lot of demand for these RL environments. On the other side, RL is very sample-efficient, so you can take an open model, create an RL environment for the specific use case you care about, and scale capabilities for that. A good example is Cursor with Composer: it's widely believed to be a scale-up of an open-source model, and the RL environment was Cursor itself. They basically gave the model the tools and the harness of the Cursor application and trained it to get really good at using Cursor. I think we'll see the same play out across all the applications. The broader theory is that every application, every company, will be an AI company or AI-native and will have an opportunity to post-train and use RL to make models work specifically on their application. Take Figma: if they want to make their platform agentic, they need to create an RL environment around Figma and post-train on that environment to be able to serve it within Figma, because out of the box the closed models won't be perfect at navigating and making those applications agentic. That's the broader theory. And the capital requirements are much, much lower than the big labs want you to believe; essentially, for hundreds of thousands of dollars you can post-train a model to be much better on your application.
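As a rough sketch of what that post-training step can look like mechanically, here is a group-relative advantage calculation in the style of GRPO-family RL methods. This is an assumed recipe for illustration, not Prime Intellect's or Cursor's actual training code: sample several completions per prompt, score them with the environment's reward, and normalize within the group.

```python
# Hedged sketch: group-relative advantages for RL post-training (illustrative only).
from statistics import mean, stdev


def group_advantages(rewards: list[float]) -> list[float]:
    """Normalize rewards within one prompt's group of sampled completions."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + 1e-6) for r in rewards]


# Example: four sampled completions for one prompt, scored 0/1 by a verifier.
print(group_advantages([1.0, 0.0, 0.0, 1.0]))  # roughly [0.87, -0.87, -0.87, 0.87]
```

Completions with above-average reward get positive advantages and are reinforced; the rest are pushed down, which is what lets a relatively small number of application-specific rollouts move the model.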
That's one weird trick: post-train a model for $100k and create a better one. So what you're basically saying is that if I'm Figma, for example, instead of using a frontier model that's really expensive and beefy, that knows some stuff about Figma but also knows about the Roman Empire, I can do RL on just my particular application, end up with a smaller model fine-tuned from an open-source base, and get better performance than with the big, beefy, do-everything omni model. Is that right?
Exactly. You get better performance, but potentially also at a lower price point, because you can really specialize the model to be extremely good for your use case. You can see this with Cognition post-training their own model and Cursor post-training their own model, Composer. Composer is also much cheaper to serve and much faster, and the same goes for the model Cognition was building. And we've started to work with dozens of customers on helping them do post-training and RL.
I think we're basically starting to see a huge pull from enterprises realizing that if they want a specific capability, RL is the way to get it, and it lets them train and serve those models quite capital-efficiently. You then get to the point where, even in deployment, all the interactions from users help improve the model. With the Cursor example, every Cursor Tab interaction, every yes and no that the user gives the model, feeds into updating the model; they talk a lot about online RL. They're basically continuously retraining the model in two-hour intervals and pushing updates to users of the app every two hours. So every user using Cursor over the last two hours is being post-trained on, so to speak, with an online RL loop. I think that's something we'll see more and more: applications will do their own RL, their own post-training.
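For a rough picture of the kind of online loop being described here, below is a small sketch under assumed names (Interaction, collect_recent_interactions, post_training_update); it is not Cursor's or Prime Intellect's actual pipeline, just the shape of batching user accept/reject signals and folding them into a periodic post-training update.

```python
# Hedged sketch of an online-RL style feedback loop (hypothetical names).
# Logged user accept/reject signals are batched on a fixed cadence, folded
# into a post-training update, and the refreshed checkpoint is redeployed.
from dataclasses import dataclass

UPDATE_INTERVAL_SECONDS = 2 * 60 * 60  # the "every two hours" cadence mentioned above


@dataclass
class Interaction:
    prompt: str
    completion: str
    accepted: bool  # the user's yes/no, used as a binary reward


def collect_recent_interactions() -> list[Interaction]:
    # Stand-in for reading the last window of logged user feedback.
    return [Interaction("def add(a, b):", "    return a + b", accepted=True),
            Interaction("def sub(a, b):", "    return a * b", accepted=False)]


def post_training_update(batch: list[Interaction]) -> None:
    # Stand-in for an RL/fine-tuning step that uses `accepted` as the reward
    # and would push a new checkpoint to serving afterwards.
    rewards = [1.0 if it.accepted else 0.0 for it in batch]
    print(f"updating policy on {len(batch)} interactions, "
          f"mean reward {sum(rewards) / len(rewards):.2f}")


# One tick of the loop; a real job would run this every UPDATE_INTERVAL_SECONDS.
post_training_update(collect_recent_interactions())
```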
And that's really how we un-hobble our way toward AGI. The question is: why haven't we automated specific, valuable knowledge work yet? I think the answer, which Sholto has also spoken about, for example around automating tax and accounting work, is that no one has really created the RL environments, post-trained on them, and then served the model inside the application where the end user is, so that the end user's interactions with the agent can improve the model further. That's really the paradigm we see playing out: thousands, or even millions, of models that continuously improve, and where the applications win to some extent through distribution, because ultimately they own the end-customer interaction. Even the Cursors and Cognitions have an advantage there over folks who are just model providers and don't interact with millions of developers, and I think we'll see the same play out across all the different applications. Satya talks about this too, even in the context of Copilot and Microsoft: they own distribution, they can create the Cursor for Excel or PowerPoint or other things, and then post-train on all those interactions. So I think we'll see this play out across all the different verticals, and it's a broader trend of every company needing to become AI-native and to keep owning their distribution; they don't want to give all of it up to the big AI labs.
Yep. That makes sense.
Makes sense. We've got a question from our intern Tyler, if we can shoot over there.
Yeah, I guess I saw you guys talk about this a little bit online, but is there any point at which you guys would train your own base model?
Yeah. One interesting release in this context is that today we actually supported Arcee in their base model release, which is kind of catching up to the Chinese base models. We supported them in training a small base model that achieves pretty solid results. We released that with them about an hour ago, and we're now ramping up with them toward a much bigger base model, fully pre-trained from scratch. We actually just had about 2,000 B300s go live yesterday to ramp up toward a much bigger
pre-train. I think the broader pattern is that since Llama had some reorgs and changes, and Mistral became more of a forward-deployed European enterprise play, there's really no one left outside of China right now going end to end in the model stack. Others like Reflection are trying to pick that up, but there are very few players outside of China. So our broader goal is really serving the world globally, but also the West and the US, with an end-to-end pipeline: from data to pre-training to mid-training to post-training, the full stack, and making that accessible to enterprises and people who are training their own models. There's a huge pull right now: a lot of enterprises, and even sovereigns and nation-states, can't train on Chinese open models, but they also can't rely on closed models. So there's a huge gap in the market that we're trying to fill by serving that whole segment.
Do you have anything else, Jordy?
No, this is great.
I want to ask one last question about what the market structure will look like in maybe a year or two around implementing these RL environments for companies. When you say every company is an AI company, I believe that's somewhat true. And I believe every tech company, maybe every founder-led tech company under 10 years old, might be able to say, "Okay, yes, we're going to go fine-tune a model and turn our application into an RL environment." But if I'm, you know, the Coca-Cola Company, I might not be at that level of building RL environments for every business process. I'm probably more of a buyer of this AI as SaaS, almost. So how do you see that breaking out? How do you see a truly legacy, non-tech company adopting a fine-tuned LLM or an RL'd model?
Totally. I think there are early adopters and later adopters. Coca-Cola might be more of a late adopter and might not need to adopt it early on, but I think they are adopting it, just in less obvious places. Initially they're just using the AI tools that use us, in a sense. Customer service is a perfect example of where you get a lot of gains out of post-training, and the AI-native customer service platforms might use us to post-train on Coca-Cola data.
Sure.
To serve them a better model. So I think what we'll see play out is making a lot of this so accessible, to your point, that it feels more like using SaaS.
One element of that is that we're launching our whole RFT platform and offering, to make it extremely easy and plug-and-play. There's also a forward-deployed element, where you can outsource a lot of that work to our team. And the other element is that we're walking the walk in terms of making our own stack agentic and autonomous, so that you could basically just use an autonomous AI researcher to do all of it for you: you plug it into your system and the AI even creates AI for you. I think that's the next paradigm, making training, fine-tuning, and post-training models as accessible as vibe coding is today. With vibe coding, literally every human on Earth is now able to code some stuff up, and I think we'll see the same play out with AI training over the next 12 months. That's one of the big things we're playing into: pushing toward autonomous AI research where AI can do most of it for you.
Well, thank you so much for taking the time to come and talk to us on the show. Congratulations on all the progress,
and we will talk to you soon.
Great to see you, Vincent.
Goodbye.
See you guys. Have a good one.
Let me tell you about Privy. Privy makes it easy to build on crypto rails: securely spin up white-label wallets, sign transactions, and integrate onchain infrastructure, all through one simple API. And I'm also going to tell you about adquick.com: out-of-home advertising made easy and measurable. Plan, buy, and measure out-of-home with precision.