SemiAnalysis on Meta's Project Prometheus: 1GW Ohio cluster set to surpass Stargate, Llama 4 called a failure

Jul 17, 2025 · Full transcript · This transcript is auto-generated and may contain errors.

Featuring Jeremie Ontiveros

…SemiAnalysis, here in the studio, talking about Meta Superintelligence. Jeremie, how you doing? Good to meet you. What's going on? Doing fine. Hey guys, nice to meet you. How are you? I'm good. Um, could you kick off with an introduction of how you got into this?

I've heard Dylan Patel's story of just kind of being on forums and nerding out about this stuff and then turning it into a career. But how did you get into semiconductor analysis? Yeah, sure. So before joining SemiAnalysis, I was a buy-side analyst.

So I was mostly focused on the stock market, on equities, looking specifically at tech stocks. I was at long-only European shops looking at European stocks like STMicroelectronics, Infineon, all these industrial and automotive semiconductors. As part of my research I discovered SemiAnalysis.

I was like, whoa, this guy's really good. And one day I just saw a post of Dylan's on Twitter. He was saying, yeah, I'm hiring a bunch of people with sell-side or buy-side experience because I want to build a real institutional firm. And yeah, we just got on a chat. That was, I think, August 2023.

I joined the company in February 2024. We were seven when I joined. Now, I think we're 33 or 34. So, yeah, it's growing every day, I think. I can't even keep track. Are there other firms like this in the buy-side, sell-side ecosystem?

I'm thinking of how Ian Bremmer has a consulting group where he writes books but also has a team of geopolitical analysts talking about global macro trends and what's going on on the political side.

Did you ever interface with any other firms like SemiAnalysis? Because it seems unique, and AI is so big that Dylan's crossed over from talking specifically to sell-side analysts to the point that even venture capitalists will read SemiAnalysis. No, I agree. I think it's pretty unique.

Look, on one end there are market research firms, folks like IDC, Gartner, and so on and so forth. There are many of them. Those would be very industry-focused.

On the other end of the spectrum you have Wall Street sell-side analysts, the Morgan Stanleys and Goldman Sachses of the world. I guess we kind of found a spot in between.

And also it's partly the team, because we have over 20 analysts, and roughly speaking you have half the people with a financial background like myself, so closer to markets, and the other half is more engineers and people that are extremely technical.

And so we kind of found that sweet spot. What's pretty cool is when you look at the different people at SemiAnalysis, all the guys like myself with a more financial background actually love geeking out and digging into technology stuff. And the same goes for the engineers.

They also want to understand the business aspect, and they end up geeking out on the data and understanding the insights and market shares and stuff like that. So yeah, we all go towards the same goal but have a different set of skills. That makes tons of sense. So take me through the latest piece.

I can imagine other players in the semiconductor market research space just wait for you guys to publish and go, okay, I think now we're ready to form an opinion. Now we can put a buy rating on it. Now we can slap a buy rating on that bad boy. Your favorite researcher's favorite researcher. It's okay.

By the time we're done, we're going to get the venture capitalist plan solidified. It's a million dollars a month for any venture capitalist. And we like to say a platform fund. Yeah. If you're not on the SemiAnalysis, you can't really be an AI investor. Yeah.

I've got to say, we found something super funny a few days ago. One of the big brokers wrote a note to clients and basically said, "Yeah, these guys are the Bible." Actually, they talked about Fabricated Knowledge, which is Doug, the president of our firm.

They said some people think Fabricated Knowledge is the Bible, but actually he works at the firm that's the actual Bible. I don't know. I didn't say it. That's a Wall Street guy. That's amazing. There we go. Yeah.

I'm a strong believer, and I really enjoy the pieces every time they drop. Take me through the latest one. Meta Superintelligence this week was almost drowned out by the Windsurf, Google, Cognition debacle. So, Zuck came out with this huge announcement, you know, Monday.

And, yeah, it almost got swept away a little bit. Yep. Yeah. I mean, I guess Zuck is all in, right? That's another proof. It's interesting because I think some people were doubting at the beginning of the year: is he going to carry on with all that investment, tens of billions of dollars?

The answer is clearly yes, he wants to do it. It's interesting because when you look at Meta's capex so far, it has been heavily tilted towards what he calls core AI, which is basically recommendation models. It's a lot of inference, inferencing advertising models and all of that.

They said publicly that in 2023, 2024, and 2025, most of that capex, which is going to be like $70 billion in 2025, is for that core AI business. So the GenAI, the Llama stuff, is still at an earlier stage. But the big question is how bold is he going to go with Llama.

And I think what we showed in this article is, hey, there's a lot of evidence that he's doing it on a very large scale. And it's not just empty statements, saying I'm going to build a five-gigawatt data center in a few years. It's actually things that are already built or under construction.

It's already committed capital. And so that's the first story: there's already a substantial infrastructure buildout specifically for GenAI.

And I guess that sort of rationalizes the amount of money that's spent on researchers, because if you think about it, hey, you're spending maybe $30 billion this year on Llama infrastructure. You're going to spend maybe, I don't know, $3 billion, $5 billion on hiring top researchers. Yeah, sure.

Why not, right? Yeah. That's a completely reasonable strategy. And it always mathed out to us that even if these crazy researchers wind up working on core AI and the Llama project doesn't even go anywhere, you could probably squeeze $3 billion out of core AI, right? I don't know.

That's just the thought I had.

Or these researchers can be focused on GenAI, but they're going to develop state-of-the-art technologies using massive compute, and then those state-of-the-art technologies can feed into the core business and end up generating more advertising sales.

And if you think about it, Meta is growing double digits on $160 billion of revenue. Yes. So every time they grow double digits, we're talking about close to $20 billion of incremental revenue. So it's big numbers, right? It's easy to justify spending a few billion dollars on researchers. Yeah. Yeah. It's great.
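As a quick sketch of the arithmetic here (the 12% growth rate is an illustrative assumption standing in for "double digits", not a figure from the conversation):

```python
# Back-of-envelope: what "double-digit growth on ~$160B of revenue" means in dollars.
revenue_b = 160      # Meta's annual revenue in $B, roughly, per the conversation
growth_rate = 0.12   # "double digits" -- assumed ~12% purely for illustration

incremental_b = revenue_b * growth_rate
print(f"~${incremental_b:.0f}B of incremental revenue per year")
```

At ~12%, that works out to roughly $19 billion a year, consistent with the "close to $20 billion" figure quoted above.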

I have one question about the history here with Meta.

So there was this story, I think it came from the first Mark Zuckerberg interview with Dwarkesh Patel, where he tells the story of the original Llama data center. The reason he had residual capacity was that he felt like he got caught flat-footed around TikTok and Reels and recommendation algorithms at scale for vertical social video, where it's much less driven by a social graph and a traditional CPU-based graph query and much more about these actual recommendation algorithms.

And the way he tells the story is, we needed to play catch-up on Reels, but we didn't want to get caught flat-footed again.

So I told the team, build two big data centers, and then we had this kind of empty data center sitting there, and that gave us the initial compute for Llama. Is that too much of a simplification? Does that feel like what happened?

Do you have any insight into how the initial compute for the Llama project was built out? And then I want to go into the future of the project.

I would just say at a high level, if you think about the amount of money that Meta has historically been spending on data centers relative to what they maybe need in theory, they have always been overspending.

They have always invested substantially in infrastructure, and the same goes for Google, I think to an even bigger extent. But yes, Meta has been a pioneer in building large-scale data centers. Over a decade ago Meta introduced their H-shaped data center design, and they've been building 150-megawatt campuses since like 2013. So building at large scale is not new to them, and that's why they already had a sizable compute footprint. But in 2023, sizable was maybe 20,000 GPUs.

That was pretty big.

But today that's been updated, right? You want to be in the hundreds of thousands, and as of today Meta is to some extent late in terms of training compute relative to others. Again, they have invested a lot of money, but a lot of that has been allocated to the core AI business.

But what we've shown in that article is that they're actually ready today to ramp a massive data center in Ohio that's going to get them there. Talk about the H shape of the data center. Why was it an H shape to begin with? And then it sounded like they abandoned it in favor of just one big tent.

What are the benefits of a tent? I want to know about data center shapes broadly. Yeah, sounds good. Who doesn't love a good data center shape? Look, the H, I don't know why it's an H. What's more interesting is the structure of the building. Okay. If you look at one physically, it's absolutely massive.

It's close to a million square feet. It's just a monster. The structure is also three levels, so it's a very complex structure. Generally, what we've observed from satellite imagery is that it takes roughly two years to build.

And I'm just talking about from first stone to actually getting the project built. So two years is a lot. Many people do that in a year or less. So yeah, a substantial time to build. It was designed for very high efficiency. They've been using a system with free air cooling.

They can get the air from the outside. There's no air-to-water heat exchange. You basically get cold air from the outside and just expel the hot air. Super efficient. You can spray some water on it to make it even more efficient. The energy efficiency ratio is typically called the PUE.

Maybe you've heard of that, maybe not. The industry average is going to be 1.3 to 1.4, which means that for every watt you allocate to servers, you have to spend another 30% or 40% of that power on cooling and power distribution losses and all of that. That ratio for Meta was historically below 1.1, so they were actually the most energy-efficient firm in the world running data centers. Interesting. But the tradeoff is that these data centers took a long time to build and had a very low power density. Got it. And so what happened is, first of all, at the end of 2022, Meta introduced the first massive design change.

They completely threw out the old H and built, let's call it, a more traditional data center design: single story, sort of a big rectangle. Faster to build, maybe one year, maybe one and a half, something like that. Much faster to build, denser, better suited for AI. It could handle liquid cooling.

But look, I think this is where the xAI story actually takes a big role. I think what Elon demonstrated when he set up that cluster in 122 days, he just shocked basically data center infrastructure leaders all around the world.

My understanding is he made everybody's life a lot harder, because before it was like, well, if we do this really quickly, I'm going to need a year, a year and a half, and suddenly there's a new standard: do it in 120 days. Yeah.

And imagine you're the infrastructure leader at a hyperscaler. You have experience developing gigawatts of capacity, you think you're the best in the world at doing that, and suddenly some guy comes out of nowhere and does it in like one quarter of the time.

So I think many people were really shocked, and I guess Zuck took it the other way and was inspired by it. And basically that's where the tent steps in, which is: let's make the data center in the shape that I can build the fastest, so that the only bottleneck is just finding some power and buying chips. As we were saying earlier, it's a bull market in tents. Zuck needs tents. The xAI engineers are intense in the office, and then my servers will be in tents too.

Everything's temporary. I mean, is there something about the tent structure that's potentially like, okay, this is going to deteriorate faster? What are the drawbacks of moving faster?

And then on the PUE question, when I hear one gigawatt, or the five gigawatts they're targeting, is that total power into the building or total power to the actual servers?

Honestly, people throw around both. In this case, based on our analysis, it's to the servers, so it's actually going to be slightly more in terms of gross utility power. Oftentimes you see people quoting total power just to have a bigger number; industry-standard practice is to quote IT power. Anyway, in this case you're going to have one gigawatt of compute power by the end of 2026 in Ohio, and then close to two gigawatts by the end of 2027 in Louisiana. That's compute power to the servers.
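To make the IT-power vs. gross-utility-power distinction concrete, here's a minimal sketch. The PUE values are the rough figures mentioned earlier in the conversation (industry average ~1.3-1.4, Meta historically below ~1.1); the function name is just for illustration:

```python
# PUE = total facility (utility) power / IT (server) power,
# so facility power = IT power * PUE. Values here are illustrative.

def facility_power_mw(it_power_mw: float, pue: float) -> float:
    """Gross utility power implied by a given IT load and PUE."""
    return it_power_mw * pue

# A 1,000 MW (1 GW) IT load at a Meta-like PUE of ~1.1
# vs. an industry-average PUE of ~1.35:
print(facility_power_mw(1000, 1.10))   # ≈ 1100 MW
print(facility_power_mw(1000, 1.35))   # ≈ 1350 MW
```

In other words, quoting "one gigawatt of IT power" implies somewhat more than a gigawatt of utility power, with the gap set by the PUE.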

So anyway, massive compute for Llama. Is that a vanity metric? Is it a vanity metric to hit one gigawatt? Exactly. I mean, Stargate you have on your chart at 880 megawatts. Is 120 or 140 megawatts really going to be the difference between an amazing superintelligence and the next best thing?

It feels like we had to cross this threshold, and one gigawatt is a good headline. Yeah, it's a good headline. That's more of it. It's not going to change much whether you have 900 megawatts or a gigawatt, but the more the better, right? You still want to have more servers. Of course. Yeah. More is better.

It does matter, to be clear. It's not going to change too much, but it does matter. Sure. Yeah. It probably matters for recruiting, too. You're going to be at the place with the best... what does the best mean? Hundred-million-dollar offers, nice round number.

One-gigawatt AI super cluster. That's a nice round number. It's great. Look, I've got a better one for you. A hundred billion dollars, right? $500 billion target.

That's something that's pretty funny to me, because actually the Louisiana project, which is two gigawatts, they said publicly it's a $10 billion data center project. But if you count it the same way as they count the Stargate project in Abilene, it could be like $100 to $150 billion.

Yeah, just pick numbers in marketing. Okay. I don't know if you have a bunch of insight or clarity here, but how are places like Louisiana and Ohio reacting in order to attract these types of data center projects?

Are they promising the hyperscalers and the labs, you know, we're going to massively expedite the permitting process? Is there deregulation happening at a local level? What can you say there? Yeah.

I think the piece of context is that since, let's say, the end of 2023 or maybe mid-2023, there has been a frantic search for power happening in the US and all around the world. And so we've aggregated some numbers.

If you look at the pipeline, there are different terms for it: data center interconnection queue, load queue, or pipeline.

Basically, if you aggregate all of the load requests that potential data centers have submitted to the grid in the US, you're above 500 gigawatts. You're close to the actual peak load of the US, right? So what's happening is pretty insane.

Those numbers are mostly fake, but what it means is that people are searching for power all around the country, and you actually have massive competition all around the country to attract those projects.

Because if you think about it this way: okay, 500 gigawatts of requests, but in the end, by 2030, you're going to have maybe 100 gigawatts of growth, which is an insane amount already. But it means that just 20% of these projects are actually going to be real. So yeah, there's definitely a lot of competition.
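The back-of-envelope math behind that 20% figure, as a tiny sketch using the round numbers from the conversation:

```python
# Of ~500 GW of aggregate data center interconnection requests in the US queue,
# maybe ~100 GW of actual load growth materializes by 2030 -- implying only
# ~20% of requested projects become real. Round numbers from the conversation.
requested_gw = 500
expected_growth_gw = 100

realization_rate = expected_growth_gw / requested_gw
print(f"Implied share of requests that become real: {realization_rate:.0%}")
```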

So people are doing everything they can to get those projects. Tax breaks are generally the most standard thing today.

Accelerating permits, reassuring hyperscalers that you will deliver on time, that you have top contractors, that you're going to expedite permits. Increasingly, there's also enabling more on-site power solutions, like being more open to people burning natural gas on site. All that kind of stuff helps companies, utilities, and locations secure those big projects. I have a question about the shape of the superintelligence team at Meta. When you think about Meta properties, you think about Facebook, the blue app, you think about Instagram, you think about WhatsApp, and then Oculus and VR is like a separate thing.

Quest. But I was toying with this idea that maybe the superintelligence team is more like the database team or the React team. It's an infrastructure layer, a project that will have benefits all over the place.

But we won't necessarily expect a dedicated vertical that competes with Instagram. It's more about making all the apps better. I want to dive into the chat app statistics that you shared and try to understand them. It feels like it's not exactly a neck-and-neck race.

ChatGPT is pretty much pulling away in terms of percentage share of queries at 71%. Meta is way behind at 12%. Are there any signals that this is something that would be addressed one way or another, or is it just too soon to tell?

Yeah, look, I think what we've seen so far is that, generally speaking, when you start to deliver a better model, a better product, you just get more users. I think we've seen a pretty good correlation between the quality of models and the usage of ChatGPT.

One thing I think is interesting is when you look at the user base of ChatGPT: you had a surge in early 2023, and then it sort of plateaued for a bit, and towards the end of 2024 you had a second leg of growth, and then you hit that half a billion weekly users. And that correlates pretty well with new releases, with models getting cheaper and better, and so on and so forth.

So if you think about how Meta could get back to maybe leading that ranking, well, they just have to release better products, right? One thing is to have good products, and the other is to have a good distribution platform, and obviously they have the distribution platform, right? Two billion daily users.

So they just need to build a good product, and I think users will come. That's kind of the value proposition. What is the state of the mom-and-pop data center market?

I don't think they would like to be called the mom-and-pop data center market, but you know, a friend or an uncle that's getting into the data center business. We asked Brian, one of the co-founders of CoreWeave, the other day. He was generally bearish.

You know, he knows how hard it is. Is there a real demand signal there, or is it just a hope and a prayer that you're going to flip it to a hyperscaler? Yeah, sorry guys. It's over. Easy money is over.

Look, as we just said, now there's really an insane amount of competition for power, and basically hyperscalers can set the conditions. So you have to be quite sophisticated when you want to approach hyperscalers and sell them a gigawatt site. You also have to think about how much money that involves: one gigawatt, you're talking about maybe $30 to $40 billion of capex. So when you sell a one-gigawatt site to a hyperscaler, it's not going to be a small decision. To be clear, buying the land is maybe, I don't know, a few million, who cares, not a big deal. But if they do make the purchase, they intend to do something with it, so they just don't want to take it lightly, and they have multiple options today. So for a mom-and-pop that doesn't dig into what they should actually do, and all the requests and such, they're just not going to get new business today. But some people made a lot of money for sure in 2023 and 2024 by just having, yeah, luck. They had 50 acres near high-voltage lines.

So that's amazing. Yeah, it's funny to imagine you've been building a data center, you know, a little mom-and-pop shop, over the last year, and then the Death Star data center just starts popping up next to you. Oh no. It's over.

What can you tell us about what changed between Llama 3 and Llama 4, and exactly what happened with Llama 4? You call it a failure. How did that happen? Because we were so scale-pilled, right? Scale is all you need. Just scale up the big transformer.

But you dove into it and gave so much more detail. How can you contextualize and explain that to us? The simple way to put it: just a bunch of tradeoffs that weren't in the right direction. If you want to simplify it, you could say that to some extent we reached peak pre-training in 2024.

I'm simplifying. I don't think it's actually peaked, but let's say for now it's peaked. As Blackwell ramps and so on and so forth, you're going to see a new push, but for now it's peaked.

And it means that if you want to develop better models, you can't just use pre-training; you have to use the new paradigm, which is test-time compute and reinforcement learning, right? And to do that, there are some specific tradeoffs that you have to make.

And basically, Meta just took a bunch of options in terms of their attention mechanism, in terms of the way they route to experts, and stuff like that.

Just a bunch of decisions that aren't very well suited to this new paradigm, which means that their flagship model, Behemoth, the largest one, is just not very well suited to this new era. And that sort of contrasts with the Chinese labs; they went in the opposite direction. Because they don't have state-of-the-art chips, they don't really have an incentive to push very hard on pre-training, and so they have been thinking harder about pushing on post-training, reinforcement learning, test-time compute, and all of that, which doesn't require such a large centralized cluster. So it's kind of those two paths: on one end you have Meta, on the other end you have the Chinese, and what the Chinese decided to do is just better suited to the current paradigm. So yeah, just a bunch of bad decisions, or to some extent unlucky. And it seems like that ties to this: there was this post by roon, a pseudonymous account on X, saying that there are in fact some secrets about what paths of the tech tree you want to go down. You poach the right researcher, they come over, and on day one they can tell you that chunked attention might be wrong for this particular training run.

Is that what you think is driving the high salaries? Is that the dynamic at play? Yeah. Okay. Yeah.

You want the decision makers that understand exactly what the tradeoffs are going to do, that also know how to properly evaluate things, or what kind of steps you should take to make sure you're making the right choice.

And so yeah, that's what you want: decision makers that have experience, that know how to do stuff, and that can easily identify the tradeoffs and know, if I want to go in this direction, more reasoning, more RL, all that, you should use this attention mechanism and not that one.

Do you have any insight into the shape of the Behemoth project right now, or the failure mode?

Like, is it that it would be bad at math, or bad at talking to it for a long time, or bad at needle-in-a-haystack in a big context window? We've just heard it's not good enough, it failed. But what would that feel like? If I were to use Behemoth for a long time, would I go, oh, this is weird, or it's bad, but in a weird way? Because it seemed like it was good at some things but then just not good at everything across the board.

Yeah, let's just simplify it and say agents. So basically, tasks that require using tools, that require reasoning, that require long context windows, things like that. It's not very good at that. Okay, got it. Jordi, last question from me, for now.

What are you expecting out of Meta over the next six months? They have talent now. They have scale, but the new team is going to need to gel, and it's going to take some time to really start delivering.

So, are you expecting a lot of public launches over the next, basically, the back half of this year, or is this more early 2026? I think the back half of the year makes sense. Really back half, like think end of Q4 or beginning of Q1 2026.

But for sure you're going to have a few months where probably not much is going to happen. Because, yeah, as of today, we showed a bunch of pictures of the current cluster. So they already have data center capacity.

But there's still some time needed to actually put the GPUs in place and make sure everything runs. So they're going to have some sizable compute that's training-ready somewhere in Q3.

And so actually having a product release, that's more really end of the year, but most likely early 2026. I would not be surprised. I wouldn't be surprised if on New Year's Eve we're back on the show talking, like we are now, about a new drop.

Can you talk me through some of the tradeoffs, or how the open-source war is playing out? Dylan from SemiAnalysis posted that the OpenAI open-source model is expected to be really, really good.

I thought the open-source strategy with Llama was a great way to be superlative on day one. It doesn't need to be the best model. It doesn't need to have the most DAUs or MAUs, but it's the best open-source one. And so it's superlative. You get the headline.

It's the attractor for talent. Hey, we're doing something different. We're the best in this one narrow thing. And there's a question about, at a certain point, does the math make sense to continue to open source?

But then when we went through the DeepSeek moment, it felt like DeepSeek was very much distilled from the GPT-4 API, not a Llama fork. And so there was the whole debate over, oh, some Chinese lab's just going to fork Llama and then improve it. So what is your state of the union on open-source AI?

The Chinese are eating open source. They're just dominating the market. It's one lab after another. It's not just DeepSeek. You had DeepSeek, then you had Alibaba.

Recently you had Moonshot with the Kimi model. They're just really good at open source, sorry, at LLMs generally, and they're open sourcing everything because they're in sort of the same position as Meta: they're not leading, so they don't have any incentive to be closed source.

They want to build an ecosystem, so it makes sense to go open source, and they're just shipping faster than Meta and shipping better than Meta. So yeah, Meta is just way behind on open source, and actually the West is behind on open source. The Chinese are just way better at it right now.

But is open source important, or is it more just marketing? Because I've always had this thing about the difference between, what was it, Stable Diffusion was open source, Midjourney was not, and Midjourney was able to get the data back from the customer because you generate four images and pick one, so you get the thumbs-up signal. To me, it's only open source if it's from the Mistral region of France. But yeah, talk about the flywheel of open source. Is there an advantage there, or is it really just, if you're not in first, you might as well open source?

Yeah, I think it's more what you said: if you're not a leading lab, you might as well open source, because you want people to help you out, give you a feedback loop, build an ecosystem. It just makes sense.

Also, I would say for the broad community generally, it's good to have open source because it's better for adoption. Anyone can play with the models and develop new applications on top of them and such. So yeah, I think for everyone it's good that there's open source.

But again, if you're not Google, if you're not OpenAI, if you're not at the very top, you don't really have an incentive to be secretive about what you're doing, because you're not the best anyway. Yeah, that makes sense. Jordi, do you have anything else? This was great.

Yeah, this is fantastic. Please hop on whenever you post anything. I'm sure we'll be giving you lots of calls because this was a fantastic conversation. Really appreciate it. Really enjoyed it. All right, sounds good, guys. Cheers. We'll talk to you soon. Talk soon. Bye. Quickly, let me tell you about Fin.ai, the number one AI agent for customer service.

The bakeoff champion. Number one in performance benchmarks. Number one in competitive bakeoffs. Number one ranking on G2. And we have our next guest in studio in person. We have Jesse from Coinbase coming in.