AMD's AI software VP on how agentic loops are supercharging chip performance and eroding Nvidia's moat

Apr 27, 2026 · Full transcript · This transcript is auto-generated and may contain errors.

Featuring Anush Elangovan

Pull my headphones back on. We have Anush from AMD. He is the vice president of AI software, and we'll bring him into the TBPN Ultradome in just a minute.

Here he is.

Anush, how are you doing?

What's going on?

Good. Good. How are you guys doing?

We're doing great. Happy Monday, and thank you so much for taking the time. Since it's your first time on the show, I'd love for you to give a little introduction on yourself and how you fit into the AMD organization.

Yeah, I joined AMD about two and a half years ago. I came in through an acquisition of a company called Nod.ai, where we'd been building ML compilers for five or six years, the precursor to AI. Before that I was at Google building Chromebooks and Chrome OS. Now I lead AMD's software strategy and software execution, trying to make sure AMD has as pervasive an AI software story as it has an AI hardware story.

Got it. I'm sure the question everyone is asking you is how AI is speeding up the actual deployment of AI models onto other silicon stacks, other chipsets, AMD specifically. How is it going? What's real? How much acceleration are you feeling? Obviously we've seen incredible performance from AMD on SemiAnalysis's InferenceMAX, which I think has been renamed, but...

InferenceX.

InferenceX, that's right. So it's clear that AMD can provide incredible inference for AI models, and I think there's an expectation that the AI models themselves will allow more powerful models to be deployed on AMD more effectively. But what are you feeling? What are you seeing? What are your timelines for more fluidity between the silicon stacks that are out there?

Yeah, very good question. Until about December, I saw it as a linear progression. I'd been here for two years at AMD and it was hard work, grind, grind, grind, and then suddenly it was like, oh wow,

software is just tokens and time. Since January, it has just supercharged our ability to execute. Even the performance work that traditionally took a little longer, like when we launched the MI355, it would take us a while to understand workloads that are not in the well-lit path and go after the performance. Now we have to be everywhere at the same time and be performant, and AI helps us do that. We have automated performance loops: as soon as a customer tries a model, we start an agent that is non-stop optimizing the customer's model. That lets us improve not just the performance but also the breadth of coverage, to make the out-of-box experience delightful and magical. And we're seeing it in customer feedback and in what we hear on social media. I track that very religiously just to make sure the experience is good, and it's been good so far.
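The "automated performance loop" described here can be caricatured in a few lines. This is purely a hypothetical sketch: the function names, the toy cost model, and the hill-climbing stand-in for an agent are all illustrative, not AMD tooling.

```python
import random

def benchmark(config):
    """Stand-in for running the customer's model and measuring throughput.
    Toy cost model: configs closer to an (unknown) sweet spot score higher."""
    return 1000 - abs(config["tile_size"] - 128) - abs(config["num_warps"] - 8) * 10

def propose(config, rng):
    """Stand-in for an agent proposing the next kernel configuration."""
    new = dict(config)
    new["tile_size"] = max(16, config["tile_size"] + rng.choice([-16, 16]))
    new["num_warps"] = max(1, config["num_warps"] + rng.choice([-1, 1]))
    return new

def optimize(config, steps=200, seed=0):
    """Non-stop loop: propose a tweak, benchmark it, keep only improvements."""
    rng = random.Random(seed)
    best, best_score = config, benchmark(config)
    for _ in range(steps):
        candidate = propose(best, rng)
        score = benchmark(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

best, score = optimize({"tile_size": 64, "num_warps": 4})
print(best, score)
```

A real agentic loop would replace the random proposal with an LLM reasoning over profiler traces, but the shape — benchmark, propose, accept improvements, repeat — is the same.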

Yeah, I think I've seen you do the Lil Yachty walkout once or twice, taking a lap courtesy of some AI company, and SemiAnalysis of course. Walk me through the key levers of AI enhancing productivity at AMD, because I feel like there's potentially a world of secret tricks to model performance that are maybe not in the pre-training data and are locked in the heads of very talented folks who work at AMD, and who are often deployed with a company to actually get the last 1% out of whatever production run is going on. But then there's the other side, which is that this feels like a perfectly verifiable reward. You can run this loop, you can sort of brute force it, and if you're deploying a large-scale model and there's a lot of money on the line, you can potentially put a ton of compute behind it. So even if the training data isn't 100% there, you'll get there through reinforcement learning. What are you seeing as moving the needle? What's the next thing that needs to happen? Are we just purely scaling compute here, and everything is one click on AMD and everything's great? I don't even know what the benchmark is, but the hypothetical full performance is real.

Yes. Right.

Yes, very good question. I'll just take a step back and say AMD has had this ethos of open source, which really plays to our advantage.

Every Frontier model that I use has already seen every bit of AMD source code.

Sure.

And it will even rewrite my spec for me.

Yeah.

Because it's already in the pre-training data, which you cannot get from closed ecosystems, because you're constrained by what's out there. We publish our ISA specs. In fact, I built a virtual GPU simulator just based off our public specs, and now I'm running it on the GPU, so I can run cross-generational GPU simulations on existing hardware. So to your point on pre-training data, we have that advantage. And we ran a dev day contest where we generated more tokens of AMD Triton kernels and HIP kernels than existed on the internet at the time; GPU MODE set that up. Now that's all part of the pre-training data, which again is a superpower, because you're open source and you're agentically accelerating this process. The second part is that the foundation is solid, and now agent loops are working non-stop. We know our rooflines, and these agents just continue executing toward those rooflines, and that makes us achieve them. From where I sit, AI has become this great equalizer. I thought abstractions alone would be the great equalizer for GPU programming, like Triton and higher-level Pythonic ones, but now it's that plus agentic AI. I have agent loops running non-stop every night that look at bugs and PRs and automatically fix them. Of course we have humans in the loop where needed, but if your harness gets really robust, it's good to be on autopilot. So I'm very confident in the enablement that agentic AI has given AMD as a whole, and in how we can execute and skate to where the puck is going, not just follow the journey of where it has been.
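The "roofline" those agent loops execute toward has a compact closed form: attainable throughput is the lesser of the compute ceiling and the memory ceiling (bandwidth times arithmetic intensity). A minimal sketch, with made-up numbers rather than the specs of any particular AMD part:

```python
def roofline(peak_tflops, bandwidth_tbs, arithmetic_intensity):
    """Attainable TFLOP/s for a kernel under the classic roofline model:
    min(compute ceiling, memory bandwidth * FLOPs-per-byte)."""
    return min(peak_tflops, bandwidth_tbs * arithmetic_intensity)

# Illustrative numbers only -- not any specific GPU's datasheet.
peak, bw = 1000.0, 5.0   # 1000 TFLOP/s peak compute, 5 TB/s memory bandwidth

low  = roofline(peak, bw, 10)    # memory-bound kernel: capped at 50 TFLOP/s
high = roofline(peak, bw, 400)   # compute-bound kernel: hits the 1000 roof
ridge = peak / bw                # ridge point: 200 FLOPs/byte separates the regimes
print(low, high, ridge)
```

An optimization agent "knows the roofline" in the sense that once a kernel's measured throughput sits on the applicable ceiling for its arithmetic intensity, there is no headroom left to chase.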

Okay, that makes sense. In the theme of skating to where the puck is going: I'm loosely familiar with the paths around CUDA and some of the trends we're seeing there, but what is going on in the CPU world? It feels like we're incredibly CPU-bound from a physical number-of-chips perspective. What are you hearing from developers and engineers on what needs to happen to unlock all of that capacity and use the CPUs more efficiently? Is there a need for more software there? Is it purely "make as many chips as we can" and the job's finished? What are people asking you?

That's a very good question. In software the job is never finished; you just go higher and higher in terms of orchestrating and enabling the last mile. You always want to see how much more efficient you can make a system. Like we started the discussion with: AMD, even on laptops, like the Strix Halo laptop, has a CPU, a GPU, and an NPU. And now with agentic AI we're actually able to provide a very clean heterogeneous runtime, and then a compiler, so you can bounce between these based on usage. If it's tool calling and the compute isn't GPU-bound and can run on the CPU, we can shift it to the CPU, but we want to make the interfaces seamless so it's elastic between wherever you want to run it. I have a Strix Halo here on my desk running a local voice model that does my transcription. It's a hacked-up version of Codex

that can actually do real-time voice transcription.

That's cool.

And I have a small keyboard that just has next session, previous session, and this is push-to-talk.

That's awesome. And this one here is the model selector, and then you can just speak to it. So in that case

it's running all of the voice on the NPU, so it's able to do real-time voice translation,

and then it uses the GPU, then the CPU, and where required it punts to the cloud, where the thinking models are, so it gets a combination of all of those.
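The CPU/GPU/NPU/cloud split described above can be sketched as a simple routing policy. Everything here is a hypothetical illustration — the names, thresholds, and task kinds are invented for the sketch, not AMD's actual heterogeneous-runtime API:

```python
from dataclasses import dataclass

@dataclass
class Task:
    kind: str        # e.g. "asr", "tool_call", "prefill", "reasoning"
    flops: float     # rough compute estimate for the task
    latency_sensitive: bool

def place(task, npu_busy=False, have_cloud=True):
    """Toy placement policy echoing the split above: streaming voice on the
    NPU, light tool calls on the CPU, big thinking models punted to the
    cloud, and everything else that's heavy on the GPU."""
    if task.kind == "asr" and not npu_busy:
        return "npu"                      # real-time voice stays local and low-power
    if task.kind == "tool_call" or task.flops < 1e9:
        return "cpu"                      # not GPU-bound: keep it cheap
    if task.kind == "reasoning" and have_cloud:
        return "cloud"                    # large thinking models
    return "gpu"                          # heavy local compute

print(place(Task("asr", 1e8, True)))        # voice -> NPU
print(place(Task("reasoning", 1e12, False)))  # thinking model -> cloud
```

A real runtime would make this decision from profiled cost models and current device occupancy rather than a hand-written rule table, but the elasticity the speaker describes is exactly this: the same task graph, placed differently per call.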

Yeah, that makes sense. The flip side of that incredibly cool four-key keyboard is that it's over for you if your entire job is based on the traditional QWERTY keyboard. I'm curious: AMD seems like a fantastic place to work in an AI takeoff, in a crazy AI-future world. What are you seeing in the people joining AMD right now who are set up for success? What does it take to get a job at AMD? Where are the new high-leverage positions within the organization, where you see, okay, this person is on a fantastic career track inside AMD?

Well, it's a very good question. What I like about AMD is that it comes from a humble place, having been through a 55-year journey, but it's deeply ambitious, and it's at the right place at the right time: we've been executing on hardware for so long, and now the software piece is accelerating it. The way I tell my teams is that we're a startup, a 55-year-old startup, and even the AI group is only a few thousand people. I encourage everyone to work like a startup and do the right thing. I have an all-hands with everyone who has a manager, because in the future it's going to be managers of people and agents; you're just going to accelerate that capability, and the breadth of what you do is going to increase. So the workplace and the culture is: be humble, be ambitious, go do it. And the AI acceleration is something I look for. I want people to see that it doesn't replace first-principles thinking, but it can do a lot of your work. My coding interview is: show me your plan and your skills, and solve this problem while you're sitting there. From the first 10 minutes of what you do with Claude Code or Codex, I know exactly your thought process, how you're going to approach it, and what your harness is going to look like. So I'm super excited about the overall team, how we're adopting agentic AI, and also the folks joining us who have deep experience in the field and are now supercharging it.

Yeah. What is AMD's equivalent of the forward deployed engineer? At any given point, how many engineers do you have on the ground at different data centers, or even within the offices of labs or other companies?

Yeah, that's a very good question. We didn't get very creative; we called it forward deployed engineering. We fix what's broken.

Yeah. So we started that about two years ago, and I guess now we should have FDAs too, forward deployed agents. So FDEs and FDAs. But it's multiple hundreds of people who are focused on the customer. The way I phrase FDE is that SDEs, the software developers, are the forward pass and the FDEs are the backward pass. The forward pass executes toward a PRD, and the FDEs execute from the customer backwards, and that's your entire model of operation. You have to be a software engineer to start either way; you just come at the same code base from the forward pass or the backward pass.

Yeah, makes sense. I wanted to ask about the story of George Hotz and how he sort of raised alarm bells. You said AMD is a startup, and I didn't think of AMD as a startup three years ago, but when I saw that interaction, the back and forth, and the change that actually happened, it felt like, okay, this company is in founder mode, this company is in startup mode. How has the flywheel of feedback from open source, from the individual, from the random Twitter poster, actually become actionable? I think a lot of companies will see something being said but not necessarily take action. How has AMD changed culturally to actually move the needle when something like that happens?

Yeah, very good question. First, mad respect to George Hotz and the skills he has. I wrote a ROCm port on macOS just based off of what he's done with tinygrad, so now I have ROCm running on macOS with an eGPU. And that's the power of open: we can see each other's work, and the industry moves forward.

On the flywheel: I personally monitor all of X, as much as I can, all the keywords, "AMD software sucks" or something like that, so if someone...

If somebody posts, you're going to take it personally, it's going to ruin...

It's going to ruin your afternoon.

It's okay. I take that as one of my jobs. I personally respond, whether it's George Hotz or anyone else; I may not know who the person is, but usually my response is: if there's a specific issue on GitHub, I will go personally track it down and make sure it's fixed. If it's an opinion, it's hard to fight opinions, and sometimes opinions are lagging indicators, and that's fine; we'll earn their trust and take it step by step. But the problems that exist, we want to double down on and actually fix. We started this about a year ago, when people were saying, oh, you removed support for this card or that card, or you don't support Windows well. So a year ago I took a poll and we sorted all the systems we needed to support, and now, at least as a community-supported version, all of that hardware for Windows and Linux, and now macOS too, is being enabled so that customers and developers can use AMD and be delighted by it.

Last question. Is AMD a car?

Is AMD a car?

Is it a car? Let me see. Well, tell me more about the car.

It's a Formula 1 car, John. It might be a car. It's a Formula 1 car. There we go. That's the correct answer.

That's the correct answer.


It's a good car, sir.

It is a good car. It is a Formula 1 car, and in the race we run, sometimes we're two inches behind and sometimes one inch ahead, but we're ready for the race. We run that race.

I love it. It's been fantastic following the race. It's been fantastic following your progress.

One thing I know for certain, you did not wake up a loser.

That's for sure. That's 100% true. Obviously, it comes across very clearly.

Thank you.

Thank you so much for taking the time. The news today: the annual developer day, San Francisco, April 30th. Go check it out. And thank you so much for coming to join us.

Nailed it.

Okay. Thank you.

Great. Great to hang. We'll talk to you soon. Goodbye.