Arcee AI launches 400B-parameter sovereign US language model as an open-source alternative to Chinese labs

Jan 27, 2026 · Full transcript · This transcript is auto-generated and may contain errors.

Featuring Lucas Atkins

Let me tell you about Railway. Railway simplifies software deployment. Web apps, servers, and databases run in one place with scaling, monitoring, and security built in. We have Lucas Atkins from Arcee AI coming into the TBP Ultradome.

Lucas, how are you doing?

Here he is.

I'm good. Thank you for having me.

Thanks for hopping on the show. First time on the show, can you please give us an introduction to yourself and the company?

Yeah. Yeah. My name is Lucas Atkins. I'm the CTO at Arcee AI. We actually changed our logo

so that it like highlights the

I have text. I'm sorry.

No, no, no. You're good. Okay. Um

And we've, you know, for a very long time been a startup focused on enterprise custom language models. And we decided to jump into the pre-training game ourselves.

Pre-training. Okay. So, small language models. I like small language models; it's the large ones that scare me. But tell me, how small are these? What is the training budget? Are we talking a GPT-2-level training run? A GPT-3-level training run? Am I looking at the data center from space, or is it a little cluster in a closet? Can you train it on a Mac Mini? What are we talking about?

It used to be that a small language model was, you know, tens of millions of parameters. Now I would consider anything under 50 billion to be resting in that smaller world. But

Small has gotten big as big has gotten enormous.

Yeah, exactly. And it's only going to get more and more as people get more RAM and as we're able to fit more on our machines. But

we historically lived in that sub-50-billion range, and then we also trained on top of other open-source models, like Llama.

Is that fine-tuning or distilling? Is there a difference there that's meaningful to what you do?

There's an open debate as to whether fine-tuning on another model's text outputs counts as distilling,

Distilling. Okay.

or if you actually have to, like, grab it from the logits. But

I would argue that it's all distilling. It's all some form of making a model better by using another model, that kind of thing.
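For context on the distinction being debated: fine-tuning on a teacher's text outputs only shows the student the sampled token (a one-hot target), while logit-based distillation matches the teacher's full next-token distribution. A minimal sketch of the logit version, with made-up numbers over a toy four-token vocabulary; this is an illustration only, not Arcee's actual training code:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities at a given temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over the full next-token distribution.

    Text-output ("black-box") fine-tuning only sees the teacher's sampled
    token; matching logits transfers the whole distribution, which carries
    more signal per example.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Toy next-token logits over a 4-token vocabulary (invented values).
teacher = [4.0, 1.0, 0.5, 0.2]
student = [2.0, 2.0, 1.0, 0.5]
loss = distillation_loss(teacher, student)
print(round(loss, 4))  # a small positive number; 0 only if the distributions match
```

The temperature softens both distributions so the student also learns from the teacher's relative preferences among unlikely tokens, not just its top pick.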

And is there a sort of power-law distribution in applications of SLMs? Like, is translation just something everyone's using them for? Or text extraction, OCR? Is there a power law in when you reach for this tool in particular?

It used to be, up until about a year ago, that you were pretty confined: you could hill climb a specific domain, and obviously it ranged based on how big, or small rather, your model was. But after this reinforcement learning revolution of the last year, if you can build a task and an environment for that model to live and breathe and learn inside of, you can hill climb. There are people taking hundred-million-parameter models and making them perform like five- or six-billion-parameter models from a couple of years ago. So it used to be a lot harder. Now it's a lot more taste and task driven.
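As a loose illustration of what "hill climbing" inside a task environment means, here is a toy random-search loop. The task, the reward function, and every number in it are invented, and real RL fine-tuning (policy-gradient methods and the like) is far more involved; but the loop has the same shape: propose, score against the environment, keep what's better:

```python
import random

def task_reward(params):
    """Toy 'environment': score a candidate on a fixed task.

    Hitting a target vector stands in for, e.g., pass rates on a
    verifiable task suite. Maximum reward is 0, at an exact match.
    """
    target = [0.8, 0.2, 0.5]  # invented target
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def hill_climb(steps=500, step_size=0.05, seed=0):
    """Random-search hill climbing: perturb the current candidate and
    keep the perturbation only if it improves the task reward."""
    rng = random.Random(seed)
    params = [0.0, 0.0, 0.0]
    best = task_reward(params)
    for _ in range(steps):
        candidate = [p + rng.uniform(-step_size, step_size) for p in params]
        score = task_reward(candidate)
        if score > best:
            params, best = candidate, score
    return params, best

params, best = hill_climb()
print(best)  # negative but near 0 after optimization
```

The point of the analogy: given a task you can score automatically, even a crude improvement loop climbs; the RL methods of the last year are much smarter climbers, which is what lets tiny models punch far above their weight on a narrow task.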

Yeah. And when you say "perform like": it feels like the true cutting-edge frontier model sort of smokes all the domain-specific models, but it's expensive. So how important is cost? What sort of cost reductions are you seeing on the inference side once a company or a developer decides to move from the heavyweight frontier model to something smaller?

It is very often when they're going into production, right? They've done prototyping with, you know, a large closed-source model.

They've had success with it. They've gotten approval. They want to bring it into production, you know, to more consumers or their customers, and then the bill starts to hit and the margins start to look tricky. So that's when they start looking for ways to host it themselves. There's also the compliance and data privacy aspect of it. And that's

something where, you know, sovereign, US-based open models have kind of stopped being released. It's very much out of China. You have a few in Europe, but it has very much been lacking in the United States. So, I was hoping to release this before I got on the show, but, you know, today we're finally releasing our 400-billion-parameter model. So we went pretty big.

There we go.

Congratulations.

You've still got plenty of day left. That's great.

Still got plenty of day left.

You're like, we're still

This is just a preview.

So, yeah, take us through the landscape of who you're competing with. Set the table for us. Most people will be familiar with DeepSeek, Qwen. How are these companies resourced? Is this just China's national interest funding them? Do they have good business flywheels at this point? Because I imagine you have a plan to sort of unseat them in the open-source race.

There are, you know, a lot of rumors about how all of this capital gets allocated into these different companies. But at the end of the day,

Rumors. Hedge funds.

Well, there are rumors that, you know, the government's giving it to them, that they have favorites, that there are shady deals going on. But

the reality of it is

They're saying they didn't train the models with scraps in a cave.

In a cave? No, no. Well, actually, frankly, if you look at kind of the horsepower that we're dealing with, a lot of them, you know, seemingly did. It's very impressive. And that's

that's kind of it, you know. You want there to be this battle of, like, the US versus China in this race. That obviously plays into our incentives.

However, these are extremely talented labs. And so, when it comes down to it, I'm not competing against China or other countries for open-weight, sovereign language models that people can own and deploy themselves. I'm competing against the other researchers there, that human being. And

you have some extreme talent there. So, yes, it used to be really hard to make money making open-source models. As they got bigger, and as they got harder for other people to host themselves, the systems and the economics started to work out, because you get the community buy-in and excitement of something being open source, but if it's a trillion parameters, your average consumer is not going to be able to run that on their toaster, right? So you kind of get this win-win situation. And now that that has started to flywheel, they've started to do the OpenAI and Anthropic and Google play, where they're building products around it. And it's really becoming a very lucrative ecosystem.

Yeah. How are you scaling the business? What have you chosen in, sort of, the business-model world? There are so many ways you can build around open-source technologies. What's working, and what are you thinking about the road map long term?

We were in a unique, very fortunate position when it came to this. We had always been customizing models for people. We had always been working for, you know, enterprises, developers, businesses: how can we make sure they're getting the best out of their own hardware? The only thing that really changed was that when customers wanted something based in the United States, or from the United States, there stopped being competitive options.

Sure.

And we never really liked the idea that the foundation of our business relied on other people releasing models.

Yeah.

So it was, like, six months ago that we fully decided, hey, you know what, let's try it. And we had success with our first model, and then our second, our third, and it kind of, you know, brought us to this. So the economics of our business don't change so much. We're still working with customers. We're still delivering, you know, models and powerful tools,

though we have a much more robust control of the stack, which means the degree to which we can customize, and the kind of period in training that we can go back to for those customers, increases drastically.

Yeah. When I think of an open-source-focused company, I think of, you know, donations, people just contributing to the game. I also think about consulting: actually going in, a company paying you to do custom implementation work. I also think about hosted services, consumption-based plans, subscription-based plans. How are you thinking that side of the business model will evolve, or where is it at now?

I would love for us to have, like, a GoFundMe or something, but no.

It's very much the latter two. I think that services and consulting, obviously, you know, if it's the right customer for the right product and the right situation, there are worlds where having a portion of your business devoted to that is important. But the real money maker, and I think this has been seen out of those labs in China and out of others, is in the tooling. It's what you're building around it. How are you making it so that other people can customize it without you having to do services? How can you make it an easy API? How can you build a developer and education suite around your software, you know, that is uniquely benefited by your model, so that they can take it and, maybe using hardware and GPUs that you own, go and train their own models? So there are many ways to turn this into something much more automated and less services-based, but at the end of the day it all comes down to getting the customer the right product.

Yeah. Who are your heroes in the open-source space? Are you a Red Hat guy, a GitLab guy? What do you like?

Yeah. I mean, you know, obviously: Red Hat, GitLab. But I very much jumped into the open-source space through open-source AI. Sure. So, as much as I guess they're now a huge competitor of ours, Mistral is huge. The OG Llama team, DeepSeek, Qwen: these are the very people that inspired me to try to compete with them. So it's kind of this very, uh,

They inspired you to grind harder, and we appreciate it. Yeah. Yeah.

Hit that America, hit that bald eagle sound. Thank you so much for coming on the show to break it down.

Congratulations on all the progress.

We'll be looking out. What's the timeline here? When do you think you'll get it out?

It was supposed to be earlier. We had some GPUs that didn't want to load up. So, it'll be here.

We're going to be live for a couple more hours. Ping us on the chat when it's officially launched.

Thank you so much for hopping on the stream. We'll talk to you soon.

Goodbye.

Let me tell you about Sentry. I need to find Sentry. There we go. Sentry shows developers what's broken and helps them fix it fast. That's why 150,000 organizations use it to keep their apps working. Okay, we've got to talk about this plant daddy.

Plant daddy is exposing his local coffee shop. He says, "A friendly reminder that my local coffee shop is running a bot farm to boost engagement. It's a very small business.

Imagine it at a country scale. If you're still responding to and reposting anonymous accounts with zero vetting, you're being used like the idiot you are."

This is crazy. Yeah. So, there's this video, and this local coffee shop, his favorite local coffee shop, has a bot farm set up in their back room to boost Instagram engagement specifically.

Does this person not know that you can just

Also, why did he let Kevin back there to video this? The opsec at this coffee shop is atrocious. Look at the fan at the end. If you scroll to the end of this video, you can see there's just a fan blowing on all the phones. You see this fan right there? That is crazy.

This seems like such an insane thing to spend money or time on.

That's what I was about to say. What is the ROI here? So, they get an extra 50, maybe a hundred likes and comments on every post. And is that enough to drive... you know, this looks like thousands of dollars of equipment. They're selling more coffee because

it's a lot of iPhones.

It is. I think they're probably Androids, because they need to be USB-C and controllable, so it's a little bit more open.

Seems like a total waste of time. I have never gone to any of my local coffee shops

because of Instagram

Instagram account

that you know of. This advertising doesn't work on me. Advertising doesn't work on

coffee but the engagement here

it's too low

I don't know. I don't know. Well, I would love to know what that proprietor, that store owner, thinks about the ROI.

I've noticed on some posts recently, yeah,

that are kind of outside of the sort of tech bubble,

I'll scroll through and look at the comments, and every single comment is AI. Like, just straight up.

Yeah. Not copy-pasted.

Not copy-pasted, but clearly it's like