Tae Kim: NVIDIA's inference wave is just beginning — Groq acquisition, TSMC wafer wars, and the coming CPU shortage

Mar 30, 2026 · Full transcript · This transcript is auto-generated and may contain errors.

Featuring Tae Kim

Bring him in. Let me tell you about Console, because Console builds AI agents that automate 70% of IT, HR, and finance support, giving employees instant access and instant resolution for access requests and password resets. And let me also tell you about Vanta. Automate compliance and security, because Vanta is the leading AI trust management platform. And without further ado, let's bring Tae Kim into the TBPN Ultradome. Tae Kim, how are you doing?

Thank you so much for taking the time to come chat with us, and congratulations on the launch of your business.

Yes.

Thank you. I mean, it's been really gratifying. That first day, you never know who's going to show up. Totally. I had maybe 15 or 20 subscribers, but hundreds of people showed up, tons of billionaires and tech founders. It's insanely gratifying.

Yeah, it's great. Uh,

incredible.

So, uh, is it over for Nvidia? They're down 21%, we just read, since the 52-week high. Is it doom and gloom? Is it over?

No. I mean, I think I was on last December, and semis and chips had gone up since then; now they're back down to where they were in December. The chip sector's flat on the year. Nvidia is down 10%. And it reminds me a lot of a year ago.

Do you guys remember

Everyone was freaking out about DeepSeek, that super-efficient models were going to destroy AI compute, that there would be a huge compute glut. And then everyone freaked out about Trump's tariff wars and Liberation Day. And this year seems very similar to that. It's like Groundhog Day.

We have fears over AI capex. People think that it might be the peak, and then we have the Iraq war, and, uh, on top of these things, oil is up here.

Iran. Easy to get them mixed up. They happen so...

Feels like the same thing all over again.

Yeah.

But um

Sorry to distract. We wanted to, we wanted to show respect.

We wanted to show respect

to a real podcaster.

I mean it's very similar to Iraq. That's

These are great. Uh, but at $100 oil, this stuff is unsustainable, and probably... okay. So, because I like the DeepSeek analogy. And I feel like the market half digested the agentic coding narrative and the Citrini article. Whether you thought it went too far, or was too hypothetical, clearly the markets did react, and a lot of names sold off. But in a world where you believe that narrative, you would think that Nvidia would be going up. But you're saying that there are other factors at play that are sort of tamping down the excitement in the market broadly.

I mean, there's no doubt. Just like the tariffs a year ago, Nvidia had a 30% drawdown when the actual fundamentals of the business were flying. I think the same thing is happening here with

the Iran war.

Um

Things will eventually subside. Oil can't be $100 forever, and Trump will probably backpedal in the next few weeks ahead of the...

So, let's recap a few of the key stories around Nvidia. We just came off of GTC, and there's a lot going on at the company. I mean, it's a huge company. Maybe it'd be good to start with just next-generation chips, changes to strategy, what people are actually buying. Maybe that means Grace CPU standalone sales, or the development with the Groq partnership. What's sticking out to you on the actual AI product side that you're most excited about?

Well, inference demand is exploding, driven by AI agents and agentic coding assistants. When I met with Ian Buck, and I met with dozens of engineers at Meta, Google, and Nvidia, all of them are seeing crazy inference demand and AI compute shortages. So across the board, people are clamoring for AI compute.

And, I mean, yeah, you're seeing that from talking to engineering leaders at big tech companies, but we're also seeing it from vibe coders who are just on X talking about how they're hitting rate limits. They're subscribing to multiple plans, and they actually shift around from one model provider to another just to make sure that they're getting the tokens they need to build whatever they're building. And you see the tweets, like people are, um,

building bots to pick up any kind of B200 GPU capacity because they're waiting weeks and months or whatever.

Like sneaker bots, but for neoclouds.

That's crazy.

Exactly.

I can't believe that.

Um, and the great thing is, Jensen, you know, he's very prescient. He probably saw this demand months ahead. He locked up all the supply agreements for memory, CoWoS, you know, connectors, ahead of time. He saw this inference demand, and to take advantage of this coding assistant boom, which is almost like a gold rush. You see OpenAI pivoting toward it. Anthropic obviously is thriving on it, billions of ARR every few weeks. Yeah. And Jensen acquired Groq, the assets of Groq and the people of Groq.

And the combination of integrating Groq's technology together with Vera Rubin lets Nvidia serve this tremendous wave of compute demand economically. Ian Buck talked about it, Jensen talked about it. So Nvidia is positioned perfectly to thrive on this coding agent wave that we're seeing right now.

On the Groq deal, Jensen did a fantastic interview with Ben Thompson and was sort of asked the same question two years in a row about ASICs, the threat of ASICs, the idea that the GPU, the general architectures, can truly satisfy 100% of demand. It feels like there's a shift in Nvidia's strategy there. Do you see that? It feels like the right move, but do you see it as a shift in the philosophy of the company or the strategy? Or is this just something where the gears have been turning for a long time, and this is maybe just an unveiling of a strategy that has made a lot of sense for a while?

I think what Jensen does is he sees where the market is shifting and where the economic value is. He did this with Mellanox in 2019. He saw the world shifting to

um

It's a networking chip, but he saw the world shifting to these 10,000, 100,000 GPU clusters, and Mellanox was needed for that. In the same manner, he saw AI agents and

the inference behind that taking off, and he said, oh, this Groq thing will work perfectly with Rubin. It doesn't replace everything. He just talked about 25% of the inference demand being what Groq would work on. But them working together, where 75% of the inference is Vera Rubin and 25% is Groq's low-latency stuff, that's like the perfect combination to take advantage of this.

Um and the other thing is like we're just in this great liftoff of AI innovation. Yeah,

We've talked about Anthropic's Mythos, the blog post that leaked out. So we're going to have this, you know, step function. They told Fortune it's going to be a huge step-up change. Yeah,

OpenAI is coming out with their model soon. And then, when I went to GTC, the biggest takeaway I had was this session between Jeff Dean and Bill Dally, the chief scientists of Google and Nvidia. It's online; I highly recommend people watch it. Jeff Dean talked about context window innovations, where they could focus on the 10,000 documents that are relevant to your request and query. So we're going to have this context window innovation. Both chief scientists talked about stacking memory right on top of the GPU or TPU, and that's going to be a huge innovation in the coming months or years. And then Jeff talked about synthetic data for audio and video. There's this huge runway; data is not over, and they're going to be able to take advantage of all this data that people don't realize yet. So you have all these vectors where AI models just keep getting better and better.

Yeah. How are you processing the idea that Nvidia will be investing in an open-source frontier lab capability? That feels potentially competitive with some customers; Nvidia's never really been in that market before. But at the same time, I've been the biggest supporter of open-source American AI models. I loved when Meta was doing it. I want more of it. I loved when OpenAI open-sourced GPT-OSS. It feels really, really important, really great, but it does feel like a strategic shift. How did you process that announcement?

It's not a huge amount. I think it's like $25 billion over the next few years, which doesn't really compete with what OpenAI and Anthropic are doing. Yeah, I guess these smaller models are going to be helpful for people running smaller use cases. So, as long as GPUs are utilized, whether locally or in the cloud, Nvidia benefits. And we saw the top people at Qwen left, and we don't know where they went. Qwen is an amazing model. It's kind of what people thought DeepSeek would be; it works well locally. Um, Qwen kind of subsides because all the people...

What's your theory on where they all went? Another Chinese lab, or...

I asked all the engineers when I was at GTC; no one really knew. But people are saying Nvidia should actually hire them, because the more capable the open-source models, the better. Nvidia doesn't care if you're using GPUs to run open source or not. They just want, you know, more AI adoption across the world.

Yeah. And Nvidia probably has more levers to pull if it turns into a negotiation with China. Like, we're tracking the Manus story with Meta. And there isn't that much that Meta can give to China in exchange if there's a, hey, look the other way on this particular deal, let this one flow through, we'll trade this. Meta's not really doing any business there. But Nvidia, of course, is going to be selling Blackwells at some point in the near future, and there's probably some level of pricing, you know, that can be part of a larger discussion. Which makes a lot of sense.

And one thing that kind of went under the radar: Jensen literally said at GTC they got license approvals on both the US and China side. We're going to see billions of dollars of H200 orders.

Okay, so, yeah, I mean, it seems like there's a path on the demand side that's very, very clear. You've mapped it out a few times. It's a huge number. It's already massive revenues, just incredible growth. But what is the supply side looking like? Because it feels like TSMC is not ramping capex nearly fast enough over the next few years, and if we see another 10x increase in compute demand, we could be really constrained on the leading-edge fab side. So how do you think Nvidia is going to process that?

Well, Nvidia is in the driver's seat, because Jensen goes there five, six times a year, he's best friends with TSMC, and he speaks at their employee day. So they're going to get, they are getting, a higher allocation of wafers and CoWoS and all that stuff. And they will benefit. But I agree with you that industry-wide,

Google is dying to get more TPU wafer capacity.

Sure.

Um, all the hyperscalers that have ASICs are trying to get more wafer capacity. So there is going to be an AI compute shortage in the years to come, just like you said. Yeah. And Nvidia just benefits, because, you know, they're the biggest dog in the house, and they can prepay tens of billions of dollars to get the allocations they need.

Yeah. I mean, maybe there's some offtake in ASICs that can potentially be fabbed somewhere else at some point. I know that a lot of the ASIC companies wind up fabbing at TSMC, but it feels like if you're already doing some sort of rearchitecture, maybe there's a way you can squeeze something a little bit out of, you know, an Intel deal or something else. I'm not exactly sure, but

Samsung and Intel are the only other fabs that can possibly do it.

Yeah. Uh

That's the bull case on Intel.

Yeah, yeah. Is that, at some point, the labs, and Google with extra TPU capacity, Nvidia, the new ARM stuff... there are just so many buyers of fab capacity now that you could imagine everyone coming to the table, potentially in Washington, DC, or Mar-a-Lago, since the US government owns a slice now, and everyone saying, okay, let's hold hands and jump across this together: if the supply comes online, we will buy it at this price, because we have really solid use cases that will justify the investment for us and for Intel. That would be a really, really good case. But again, even if the money is there, how long does it take to get to, you know, good production numbers?

I mean, I suspect Apple and Nvidia are considering either Intel or Samsung for their lower-end stuff. Whether it be a mid-range iPhone, or on the Nvidia side, definitely their consumer gaming GPUs, they might go back to Samsung and maybe even Intel.

Yeah, I have one more, but go for it.

Uh, I wanted to know how you're processing the ARM CPU announcement. It's an interesting dynamic, because they're sort of frenemies with Nvidia now. They're working in many ways to break the x86 monopoly, because they both are selling ARM CPUs, but then they're also competing. So I'm wondering how you think that plays out, what that means for Nvidia and the rest of the semiconductor supply chain.

I think ARM's CPU opportunity is longer term; you know, even they said 2030, 2031. Yeah,

it's a longer-term opportunity. I don't really expect the major hyperscalers like Amazon to switch to ARM's, you know, product offering. They have their own, and same with Nvidia: they have their own ARM CPU that they're going to incorporate and sell. So I don't think Amazon or Nvidia are really worried that ARM is going to take any big share. It's probably going to be on the margin, for companies that can't develop their own ARM CPU, the more mid-tier hyperscalers or enterprises that use these things. But I think the ARM thing is very important, because it kind of confirms the biggest underlying thing that's not really consensus yet: this massive CPU shortage that we're seeing just over the last few months. We have Dell, AMD, the Intel CFO talking about three-to-five-year locked-in supply contracts from hyperscalers. So this is a major trend that's going to go on over the next few years, and the reason why is AI agents need more CPUs. The ARM CEO talked about four times more CPU cores versus last year's AI infrastructure model. So we're going to see this massive demand for CPUs that people aren't really understanding yet, because with AI agents, the whole thing requires orchestration, tool calls, database queries, web searches, and that's all handled by the CPU.

Yeah.
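As a rough illustration of that point, here is a toy sketch of a single agent step. Every function is a made-up stub (none of these names come from a real library), but it shows the workload shape being described: one GPU-bound model call wrapped in several CPU-bound orchestration and tool-call tasks.

```python
# Toy sketch: each "agent step" is one accelerator-bound model call
# surrounded by CPU-bound work. All functions are hypothetical stubs.

def search_web(query):
    # CPU-bound: network I/O, HTML parsing, ranking
    return [f"result for {query}"]

def query_database(query):
    # CPU-bound: serialization, query planning, execution
    return [("row", 1)]

def call_model(query, docs, rows):
    # GPU-bound: the only step that actually needs an accelerator
    return f"answer({query}, {len(docs)} docs, {len(rows)} rows)"

def run_agent_step(query):
    docs = search_web(query)        # CPU
    rows = query_database(query)    # CPU
    return call_model(query, docs, rows)  # GPU

print(run_agent_step("same-store sales"))
```

The point of the sketch: as agents run more tool calls per model call, CPU demand scales alongside, and sometimes faster than, GPU demand.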

Give me your bull and bear case for Terafab.

Terafab, I'm not that optimistic. I mean, it's so hard to

Give me the, do your absolute best to give me the bull case.

Because TSMC is so short that, you know, Elon needs to find... but even then, how are they going to buy semicap equipment from ASML and AMAT? There's just no capacity there. So I'm not optimistic on that.

And this is stuff that takes decades. Um, chip fabbing is almost like cooking. It's not something where you could just follow a manual. It takes a lot of trial and error accumulated over decades at TSMC and even Intel. So it's not something you could just jump right in and do.

Yeah, it somewhat goes back to the xAI debate about, like, do they need AI researchers, or should everyone be an AI engineer? Are we in a research period or, you know, the Ilya Sutskever age of research versus the Elon Musk age of engineering? Where are we in semiconductor production? It feels very much like an engineering process. But what we've seen from ASML and TSMC is that it does feel like there's a little bit of research and artistry to it, like the cooking analogy.

Yeah, I've been doing a lot of research in the space, and it's a lot of trial and error, almost like cooking a recipe.

And it also feels like, at least with xAI, if all the researchers are in San Francisco, you can sort of just walk across to the coffee shop and poach someone. But if the best semiconductor engineers or technicians are in Taiwan, and they see it as a national urgency to bring stability to the country, both economically and geopolitically, then you have a very different calculation. It's like, oh yeah, I could make five times as much, but I'd be abandoning my home country. That's a very different calculation. And everything that I've heard about the culture at TSMC is that the folks who work there are extremely dedicated beyond the economics. They are true missionaries, not necessarily mercenaries. So it does feel like it's harder to do a talent raid in the leading-edge fab world than even the AI world, which is extremely competitive and still has tons of missionaries, but fabs...

I guess another question I have is, would you expect xAI slash SpaceX at any point to basically just open up shop as a neocloud? Because probably one of the least compelling aspects of the Terafab pitch was him just saying, "We need all of this compute. We need to do this because we're going to be so chip constrained. We're going to be so supply constrained." But there was no explanation of

where the demand was coming from,

where the demand was going to come from. Is it going to come from

training Tesla models, Optimus, or Grok, or...

Yeah, it was just very unclear.

It was a lot.

But there's even the question right now of, should xAI be kind of renting out GPUs? I don't know. Because the biggest win has been Colossus 2. Yeah, Colossus 2, which was built very fast.

I think Elon's pitch with the SpaceX IPO, and we'll see it in the coming months, is AI compute. There's going to be so much demand over the next 5 to 10 years that you're going to have to use the SpaceX satellites that have GPUs in them to serve it.

And maybe, I mean, even though Tesla's been vertically integrated to the point of being a consumer product, SpaceX has not. It's been a railroad. And there is a world where you fab the chips, you put them on Starlink satellites in space, and then you let other companies do whatever they want with those GPUs.

Think about what Elon did with Starlink. I mean, that's a telecom infrastructure play, and this would be an AI computing infrastructure play.

Yeah. Yeah. Yeah.

Fits that model.

There's a world there. Um

I'm not going to bet against Elon. It might just take a long time.

Yeah. Uh, what about helium? What's going on there? What are you tracking? There's chatter about potential helium shortages.

Jensen has talked about this. This is a risk,

but

there are probably like six to nine months of inventory in the channel. Bernstein has talked about it; it's not a risk in the short term. So if this Iran stuff lasts, you know, two, three, four, five months, then it becomes a problem. Okay. But if it, you know, gets solved, or opens up with the toll or whatever final negotiation they come up with over the next few weeks, I don't think it's going to be a problem.

Yeah, I do think that, like most of these materials, there are extra deposits. They're just not economical to mine. I don't think that all the helium exists in the Middle East. That would be...

It's similar to the rare earths thing, just like you said.

Yeah, where, in a supply-constrained scenario, it becomes more economical to mine American helium.

Let me put it this way. If helium becomes an issue, we're going to have bigger problems on our hands. Okay? I mean, there's going to be world starvation.

Let's hope not. Let's hope that that'll be the least of our problems if helium becomes a problem.

Take me through, uh, depreciation-gate. How did you process that? And where do we stand now with the fear that GPUs will depreciate precipitously and H100s will be worthless in six to 12 months?

It's totally not a problem right now. Like, CoreWeave has talked about these things lasting five to six years, and they're getting like 90 to 95% of the pricing. So

it could potentially be a problem if this whole thing is a bubble. I don't think it's a bubble. Yeah. But if there's a bubble two, three years from now and there's a compute glut, then

yeah,

you know the stock's going to go down because there's a compute glut. But as of now, it's the opposite. All the GPU rental prices, even for stuff that's six years old, are still sold out. The AI compute demand outpacing supply is so large that this is not an issue right now.
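For context on why the useful-life assumption in this debate matters, here is a minimal straight-line depreciation sketch. The $30,000 GPU price and the 3-versus-6-year lives are illustrative assumptions, not figures from the conversation.

```python
# Straight-line depreciation: spread (cost - salvage) evenly over the
# asset's useful life. All numbers below are purely illustrative.

def annual_depreciation(cost, useful_life_years, salvage=0.0):
    """Depreciation expense recognized each year."""
    return (cost - salvage) / useful_life_years

# A hypothetical $30,000 GPU: moving from a 3-year to a 6-year useful
# life halves the annual expense hitting the income statement.
short_life = annual_depreciation(30_000, 3)   # 10,000 per year
long_life = annual_depreciation(30_000, 6)    # 5,000 per year
print(short_life, long_life)
```

That is the crux of "depreciation-gate": the same hardware purchase looks twice as expensive per year under the short-life assumption, which is why reported rental lifetimes of five to six years matter so much.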

Do you have any theories on where the next step change in token demand could come from? Because right now we're seeing it in codegen, and there's a lot of optimism around these types of workflows being applied to other forms of work. But we were talking about this on Friday: even if AI can just one-shot beautiful financial models, it won't necessarily make a real dent in token demand, at least compared to codegen, because no company needs to constantly be generating models at the rate that, let's say, Garry Tan generates code. And so I've been trying to wrap my head around where these incremental use cases could come from.

I actually think codegen is still just early innings. Like,

Yeah. And I don't disagree with that.

People are running 10, 20 agents and kind of overseeing them. But then we have this other stuff, where these models, Mythos and OpenAI's, are just going to get better, where you could automate all these work process flows. Companies are going to use them for every single vertical: customer service, research, simulating chip design, where they can verify; drug discovery, where they can verify drug molecules. So we're just getting started at this stuff. You're going to see vertical AI agents in every single category. And I think Logan wrote this great post on

X. Yeah.

Um, he says this AI agent wave is going to

kind of attack this $6 trillion knowledge economy, right? It's not just about programming anymore.

They're coming for us.

Yes. I don't think... see, I'm actually...

They're attacking the Key Context economy and the TBPN economy. No, I think it's like a calculator or a spreadsheet. You know, 30, 40, 50 years ago, we had, like, 50 accountants doing the spreadsheet manually, right? And after the spreadsheet came, it didn't get rid of all knowledge work. It just enabled people to think at a higher level and get more done. And yeah, I'm very optimistic about that.

I mean, one way that you 10x token demand around a financial model, without 10x-ing the number of financial models that you're building, is having the agent go and collect 10 times as much data. And so there are a lot of situations where, I mean, you look at hedge funds that want to understand the price of Walmart stock. There are hedge funds that will task satellites to take pictures of Walmart parking lots, estimate the number of people on a day-by-day basis that are going into the Walmart to shop, and then use that as a proxy to project revenue, then flow that through to cash flow, and then flow that through to the DCF and the actual valuation of the company. And if you think about all the different financial models and all the different businesses, you could go and say, well, for this company, I want to know the price of Squarespace. Let me go to every single website that's powered by Squarespace and estimate the revenue that they're bringing in and their willingness to pay for their hosting service, something like that. And all of a sudden, it's just one spreadsheet, it's just one number at the end of the day, but a thousand times more work went into it.
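The satellite-to-DCF proxy chain described above can be sketched as a toy calculation. This is a hedged illustration only: the visitor count, ticket size, margin, growth, and discount rate below are all invented numbers, not figures from the conversation.

```python
# Hypothetical sketch of the "foot traffic -> revenue -> cash flow -> DCF"
# proxy chain. All inputs are made up for illustration.

def project_revenue(daily_visits, avg_ticket, days=365):
    """Scale estimated daily store visits into annual revenue."""
    return daily_visits * avg_ticket * days

def dcf_value(annual_cash_flow, growth, discount, years=5):
    """Present value of a cash-flow stream growing at `growth`,
    discounted at `discount`, over a fixed horizon."""
    value = 0.0
    cf = annual_cash_flow
    for t in range(1, years + 1):
        cf *= (1 + growth)                 # grow next year's cash flow
        value += cf / (1 + discount) ** t  # discount it back to today
    return value

# Satellite imagery suggests ~2,400 shoppers/day at an average $60 ticket.
revenue = project_revenue(2_400, 60)   # ~$52.6M/year for one store
cash_flow = revenue * 0.05             # assume a 5% free-cash-flow margin
valuation = dcf_value(cash_flow, growth=0.03, discount=0.10)
print(round(revenue), round(valuation))
```

The token-demand point is that every input here (visits, ticket size, margin) can itself be the output of an agent crawling and summarizing thousands of documents, so one final spreadsheet number can hide orders of magnitude more model calls.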

Let me give you this great example. Every year, I do the same-store sales for these fast-casual companies, like Chipotle and Cava, and I put out this tweet. It goes viral. A year ago, when I did it, I would have to manually go to every IR website for these six fast-casual restaurants. Yeah. It would take me like an hour or two. Yeah.

Um, I would try to use a chatbot, and they would get it wrong.

Sure.

I did it a few weeks ago, and all the chatbots got it perfect. So it just saved me two, three hours of tedious manual labor. So that's only going to get better and better. Like,

yeah.

Yeah. And it's only going to take you one... like, this year is the year that you do it with multiple chatbots and fact-check it yourself, and then forever it's going to be just one prompt.

and it get it got it right. A year ago, it wouldn't get it right. But now, in one two minutes, I put give me the same store uh sales for these six restaurants. Yeah,

I put it in Gemini, I put it in ChatGPT, just to make sure they're right, and they're right. So all the tedious labor, all the manual labor, all the data entry that, you know, all of us are used to, that stuff is going away, and we can think at a higher level. So I could look at the same-store sales and say, oh, the economy is at risk, or whatever. But all the grunt work, all the tedious work, is going to be taken care of by these AI agents.

I agree completely. I agree completely. Uh

We got a lot more sound effects since the last time you joined. Uh, last question for me. What's your outlook on Meta? It feels like the broader market right now has zero faith in Meta to actually put

all their AI investments to use.

I have this history with Meta: every time it starts falling apart, I say it looks cheap, and then it goes down another 30%. But nothing has changed. No one's going to replace Meta's digital ad position. I mean,

I would even say that in the AI world, they're even better positioned, because Google might lose digital ads share to AI chatbots, given their search position going forward. So no one's going to replace Instagram, no one's going to replace Facebook. Billions of people are still going to use those social media apps. And, you know, every 6 to 12 months, everyone goes through this bear Meta cycle, but their core competitive position really hasn't changed. And you saw what happened to Sora, right? Everyone was all excited about Sora, and that got...

Totally. Yeah. And there's just this world where, even if the AI spending is like a side quest, it's like they just pulled forward three or four years of capex, and they will use that for their other products. It's probably even less wasteful than Reality Labs spend, which might take even longer to realize the cash flows from. They can recoup it: okay, we built this massive data center, we did this training run, we didn't get to the frontier, we're not getting a lot of genAI usage, but we can apply it to our ads platform and tools and Reels recommendations and a million other things in 2028, 2029. And yeah, we're a little bit ahead of schedule.

or ad engine monetization

100%. Yeah, the GEM model.

With Reality Labs, he may have wasted $70 to 80 billion. He might waste a hundred billion dollars on these frontier AI models. But the ad engine, the core business, that money-making engine, is not going to be affected by this.

Yeah. Well, thank you so much for taking the time to come hang out. Always a great time. Go subscribe to Key Context on Substack. Follow Tae Kim on social media. First adopters.

Join the many billionaires that were the first adopters.

Yes. Yes. You'll be in good company and thank you so much. We'll talk to you soon. Have a great week.

Great to see you guys. Cheers.

Let me tell you about Figma agents. Meet the canvas: your AI agents can now create and modify your Figma files with design system context, in beta starting today. And let me tell you about Graphite, code review for the age of AI. Graphite helps teams on GitHub ship higher-quality software faster. So,

uh, Chamath.

Yes. Holly says the biggest threat to Instagram's moat is an incredible image model.

Okay.

Zephyr says metabot.

Um

an incredible image model. It should.

I mean, you're basically saying, okay, if the content on Sora was a hundred times better,

would that be a real threat to Instagram?

And I still am not I'm still not convinced.

I feel like a lot of people have their network there. They want to share with their friends. They have a graph there. And even though the content doesn't come through the graph anymore, having...