Shortcut launches as an AI Excel agent that beats Goldman and McKinsey first-year analysts 89% of the time
Jul 28, 2025 · Full transcript · This transcript is auto-generated and may contain errors.
Featuring Nico Christie
a daily driver and then a backup. I mean, we search a lot about like true crime stuff. I don't want to be flagged in some database. I want to do that inference locally so that uh I don't get uh subpoenaed or anything like that. Um anyway, we have our next guest here in the studio. Welcome to the stream. We made it.
We have super intelligence and Excel, but we're still figuring it. We're still figuring out the room. Wi-Fi is coming in a long time. What's up, boys? How you doing? What's going on? What's going on? I couldn't bring the suit. I thought the white coat was the next best. No, no, white's great. The market's up.
We're wearing white. It's very good. It's a good option. Okay. So, break it down for us. What are you building? FinTwit was freaking out this morning because they uh they all thought you paperclipped them all. Okay. Um so, yeah, talk about the launch.
So, uh, Nico was on the show a while back talking about kind of the pre-release. Um, and the app is live today. Sounds like you've improved it quite a bit as well. So, yeah, break it down. Yeah, sweet. Thanks for having me, guys. Um, yeah, FinTwit's having a meltdown.
Um, they're having the kind of moment that we had August 2024 in software engineering. Um, Karpathy tweeted, you know, Cursor is kind of there now. Yeah. Um, yeah.
And what happened was there was a lot of hesitation from senior devs, and I think a lot of the junior devs became really really good really really fast, and there's a word for them now, like context engineers. And I think where we are now is this moment's happening for Excel.
So our product's called Shortcut and it's a superhuman Excel agent, and it does feel like that August moment: for some things it's unbelievably better than humans and for others it's just like stupider than your intern. Yeah. Yeah.
Isn't the dynamic of like the finance world and the Excel user radically different than the software engineer?
I feel like if you're if you're an amazing software engineer, you can work your way up at a big tech company, become like a senior engineering fellow, and basically always be working on algorithms and code and software your entire career and become fantastically wealthy, become an AI engineer that makes $100 million.
But in finance, you go to an investment bank or a hedge fund or a private equity shop, you're working in Excel, and then you level up and it becomes all about relationships.
And so you don't actually like when I think of like a managing director at, you know, KKR or something, I don't think, oh wow, that guy's really good at Excel. I think that guy knows well he probably used to be probably exactly used to be. So yeah how does this dynamic actually play out? Yeah. So that's a good call out.
Um it's actually the same in tech but there's like two paths. You could go tech lead path, and tech leads are like, you know, doing some wizardry on whatever algorithm or backend infrastructure.
You could go director path and it just becomes like you talk about or you brag about how much headcount you have under you. That is like the only thing that exists in finance.
And when we show people in finance this, their first reaction is like like it's kind of [ __ ] because like Excel is the reason that I'm where I'm at now. And that path doesn't seem like it's going to be a thing. Um it is different.
I mean it feels like you'd like even I feel like a lot of people uh entrepreneurs don't want to be vibe coding themselves. They want to hire somebody who's really good at vibe coding and they want to get the value of a senior engineer for the price of a junior engineer.
I imagine if I'm an MD at an investment bank and I'm like, great, uh I can even more quickly turn over a turn of the DCF to my client.
Uh and yes, I don't mind keeping the Harvard grad, you know, up all night on the weekend, but if he or she could get it back to me in an hour instead of 10 hours, I'd be happy with that. Yeah. Yeah. And you'd still rather that person work 30 hours a day and just send you more. Totally.
But yeah, we know this because of tech. Like this happened and it was actually much scarier 2 years ago. But whenever you can drive the cost of inputs lower and your ROI goes up, like the obvious thing to do is just put more gas on that fire. Totally.
Um it's just it makes sense that it's scary as in like we used to have tools and now we have things that can use our tools, right? So um I think it's inevitable that they're going to learn what we learned and that's how I'm going to position it.
So, we need to go to the finance community and teach them about Jevons paradox. I'm glad somebody's bringing this up. We're going to host a Jevons paradox happy hour. Uh rooftop bar, you we're going to be in New York.
I was about to say we'll start we'll just go on the street if we see someone dress a little circuit. Yeah, we should. Are you Are you in New York? No, we're in Menlo Park. I'm the only person in Menlo Park who knows finance, unfortunately. Hey, what about the venture firms? They they need to know.
Uh if you're serious by 100, you go to private equity or VC or private equity or hedge funds. Yeah, sure. I was like, wait, what? No, no, no. You put it in the Excel sheet. Multiply the revenue by 100. That's the value. You do a little bit of this and you close your eyes. That's the valuation. Slap a multiply by 100.
Doesn't mean I need a model to to determine the valuation. It's I'm just going to pay slightly more than my What is the shape of the user base?
uh where is Excel actually like powerfully used? Because I feel like so many of the hedge funds, they used to do stuff in Excel, then it went to VBA, then it went to, you know, high frequency trading and proprietary systems. And I'm sure there's some stuff, but it's usually in like private markets, less liquid markets. Like where is Excel holding on really strongly? Um you know I'd say hedge funds are like the toughest customers. Um anything where they're like really heavily relying on like already low-level systems or real-time data to public access markets, like that is a tough customer.
Thankfully, they're like a small sliver and I'm not sure I'm building for them. Um, but anything like private equity where you're getting a CIM and you need to build a model on it or a mini model, anything where like you have a template, you want to update it in light of new information.
Corporate real estate's like a bizarre area where there's a lot of PMF. Um, but it seems to be that like building new models can be 100 times faster now. Editing existing models is a little tricky.
I mean, if you really want to swap a model for like new data, that's kind of like the highest bar that you can possibly have, but seems to be solvable. Sure. So, what's the workflow now? I'm in PE. I have a CIM.
Uh, and for those that don't know, that's basically a bunch of PDFs and data showing like showing what's actually happening in a business that might be for sale. Can I drop how how soon? Let me walk you through. Um, and that's exactly right. So, it's like a confidential investment memo from, you know, the sell side.
A big bank will send it to a PE firm or they'll send it to a bunch of them. Um, typically what you do is like you try to analyze, you diligence the deal and you look at this big PDF full of data and you try to model it to like understand is this a good deal, should I bring this to my boss or whatever.
You're usually constrained by how fast you could do that analysis. Um, currently now you can just attach a PDF of a CIM. They're usually PDFs or multiple PDFs or customer data dumps as well. And you can just say like, hey, here's my template. I use this for my LBO model. Um, make me 10 of them.
And in fact, now you can also just email it and be like, I want a hundred. Yeah. Um and you can get all of that, or you can attach 100 attachments and ask for 100 models back, and then it becomes more like how fast can I review data than how quickly can I do ground work. Yeah.
Um what about just like FP&A orgs in like small and medium businesses? I feel like when I've run companies I have like one finance person and we have a question: uh should we expand into this market, or should we add a new SKU, or should we do something? Should we take out this line of credit or whatever?
And usually there's like an Excel model or Google Sheets model that's like built, but there's not even like a Canalyst to pull a model off the shelf. It's like no, I need to just be in Excel and model this out, or even like CAC to LTV. How are my KPIs trending?
There's so much work that goes in the long tail of Excel. Is that just like not the early market for you or do you see people abusing the tool? I think the crazy thing is I do these live demos and people are like, "Oh, that's insane. But is this just DCFs?" Yeah. Yeah. Yeah.
And I'm like, you know, we are in the tech world, so we think tech is the center of everything. But anything you can possibly do in Excel is way more in distribution than like Ruby on Rails. Yeah. Right. Like there's no there's no kind of model you can conceive that is like unfair to ask.
It becomes like what are the fundamental limits of AI? And it turns out like writing Excel formulas is a lot easier than writing like amazing code actually. So it's more more or less like how did this not exist at this point?
But yeah, I have to find you said uh you said Shortcut beats first-year analysts from McKinsey and Goldman head-to-head 89% when blindly judged by their managers. We even gave humans 10x more time. What did that kind of like benchmarking exercise uh what did that look like?
We gave multiple first-year incoming analysts from McKinsey, Goldman, BCG, um JP Morgan, a couple other firms um like five tasks, and we thought like they were the most in distribution for, you know, classic finance work.
So this was consulting, like building op models, M&A models, LBO and DCF, and like personal hobby stuff and actually even product management, like build a dashboard over this customer dump. Um we then gave them over 90 minutes per task.
Um some of the hobby ones had like 15 minutes and then we just asked them to submit it, and then when we did that we also got in touch with managers from these firms and just sent them a Google form with like side-by-sides completely anonymized and just pick your preference according to like accuracy, professionalism or whatever.
Um I thought it was going to be closer to 50/50. I think what I really learned is first year analysts are like kind of lousy. Like they're not that good at building Excel models. Um and it's and then also I think people just take way longer to do things.
I think if I had to critique the study it would be that like five hours is not enough and like they would have liked to have a week, one person wrote me. Um but I didn't want to run that study. Yeah. What's your read on the METR uh data that uh these coding agent tools did not actually speed up uh software engineers?
Did you see this? Uh so METR did a blinded study. It was small, small number, but it was interesting because it was very solid software engineers working on advanced bugs in like big open source projects.
Not vibe coding a landing page like like truly like there is a sticky bug hanging out in some fundamental repo out there. Go fix it. They estimated that they would be 20% faster, 30% faster, something like that. Turns out they were 20% slower.
I'm wondering if you'll see a similar pattern where there's a speed up for bootstrapping and getting from zero to one on a project, but then for the really crazy person that already has all their macros and doesn't touch the mouse, maybe they're actually going to experience a slowdown. Do you think that would happen?
What do you think about that? Yeah. Yeah, it's a great question. I saw the study. I didn't read it deeply. Um, what I will say would be bad for you. It's just like it's just an interesting I think the obvious thing is that it's like using AI is a skill also, right?
So like the top startups have like the best context engineers in the world. They know how to optimize the KV cache hit rate whatever it is. Um if you kind of give like even a really strong engineer who's been coding for 20 years you're like now you have to use this tool or use it however you want.
Um I have no doubt that even the best engineers and the oldest engineers like eventually would find patterns in which it's massively unlocking for them. So I think that is what's going to happen.
But I actually to be like the most critical I think clearly Excel modelers are way less technical and way less on the frontier of tech than our software engineers.
So it's actually more on me as like a product person to like how do I unearth some of these capabilities and hide the ones that they're not ready for um and even train people for this because it is 2023 all over again. Yeah. How how templatable is the work?
Because I know Canalyst sold to Tegus, which sold to AlphaSense I believe, and Canalyst was kind of like off-the-shelf financial models, some data integrations. Capital IQ has had integrations.
You have a Bloomberg terminal, you can have integrations, and so there's almost a world where you want to like sit on top of a library of templates. In some ways, the way like deep research is clearly RLHF'd on like a couple templates of like what a report looks like. It loves tables.
And so I I don't know if you want to do like fine-tuning or like have a model that selects a pre pre-filled template and then you're using your tool just to update that template. What do you think about that? Yeah. So let's break it down into two things.
One would be like product and then the other one would be training.
Um as far as product is concerned, like I think the most important thing you can do is let users actually upload their templates. Like your income statement is going to be different than, you know, Warburg Pincus or whatever. Um and then just let them work naturally from theirs. But then from the training perspective you have to collect as much of this data as possible and train your models.
Um kind of definitionally if you're using frontier models you can be limited in what you can and cannot serve in terms of trained models. Um but it's actually a much more constrained environment. So like I think we stand to benefit from it much more than does like Operator or, you know, deep research. Yeah.
Yeah. Makes sense. Jordy, anything else? Um where how do you think the the product uh needs to improve going forward? It sounds like in the in the study uh in the test that you ran with the the incoming analysts, it performed very well. Uh that's kind of like a a unique situation.
What's it going to take to get it to the point where it's actually disrupting the job market in, you know, in some of those roles? Yeah, it's a great question. Um, clearly right now I think what it does is it makes like maybe you guys are rustier on Excel than you used to be.
Like it makes you much better at Excel, and if you're kind of a rookie or like you're a, like, solopreneur and you want to like have a one-man finance shop, like you're much better now already with it. So for that like we passed the capabilities threshold.
The big question becomes like for like real enterprises where people hardcore use it, are we there? And I would say like if August is that moment, we're like in June.
Like it is a pretty clear capability threshold we have to pass and we're actually actively adding all of the limitations to our internal benchmark and just hill climbing them as aggressively as we can. Um part of it becomes a little bit of a product science of like how you can abstract the limitations away from them.
Um, but like more specifically, it's exactly editing and overwriting large nasty existing templates and files with new data. It's like, yeah, if you don't do that at a certain percentage of accuracy, it's just a non-starter.
Yeah, I feel like part of the challenge will be figuring out like giving the user tools to figure out where Shortcut is hallucinating, right?
If you have like part of that I remember, you know, working with um, you know, I'll get a model made in the past and I just look at it and I'm like, okay, everything looks good, but like it's definitely off, and like it's just like it's too good or it's too bad. Bad model smell. Yeah, exactly. Um and then you dive into it and you find like one or two kind of reasons for that. Um and I think like the people that might critique Shortcut today are going to say like, "Oh, I used it for this or that and like it missed this column or whatever," you know, um, that kind of stuff.
So, I mean, in terms of how we think about it, I'll tell you how we think about it and then like what actively we're doing about it. But it's like this is not a new phenomenon. Like in coding it had to get certain good and it had to be certainly like at a certain level of observability, and when Cursor like brought the apply diff function, it actually made the job of supervising AI like doable. Like I used to copy blocks and I wouldn't know what would break.
Um, that was the big thing. It was actually observability, not accuracy. And then Sonnet got good enough. Um, in like self-driving, it's safer than we are, but we still don't want to make that trade because we want to be like that guy killed that person, right? Like we need to be able to blame people.
Um, radiology, it's there too, right? Like we just don't know who to sue if something went wrong. Um, I think for finance, like exactly what you're talking about is we had to have a UI that allows for you to observe the diff. Um, and like we had a good inspiration, right, Cursor.
So actually now it's part of the launch video I just I just shared. Um, and I know we were talking about like product demos, which I have a strong like, you know, opinion about.
Um, like the whole like you have to be able to see everything that's changed, but also the root material because in Excel definitionally some things are hardcoded, but you want to know where those came from because you can't justify it, right? Like where what part of the 10K did it come from?
What web search did it come from? What's the exact quote? Like what page? So I really do believe that it's not about getting to 99.9% accuracy. It's about like getting to perfect traceability, cuz like you know your analysts suck. Like even your associates are not good, right?
So you just have to be able to observe it at a superhuman level. Do you think Well, yeah. And the good thing for analysts is like you combined with a product like this could actually get to the point where you're truly elite. Yeah. So I know people are afraid. Let me hit that.
Like clearly analysts should be using this or a tool that will try to do what we're doing. Um no doubt about it. It's the right launch is a non-zero amount of controversy, right? Like I know I know that's an emotional reaction that people have, but clearly people will be using this the same way.
Like if you don't use Cursor now, like you don't know what you're doing, right? Yeah. Yeah. Yeah. Do you think you're not afraid to trigger a couple of incoming analysts that say I used it and it did this wrong. It's like well you probably would have made a mistake too. Yeah.
Do you think you'll face more competition or that there will be more war rooms planning to compete with you from the hyperscalers who have products that they want to add this feature to or the foundation model labs who see hey uh you know what I can maybe I don't have control over a consumer application but I can do a lot of these calculations in pandas in Python.
Yeah I'll answer those. Yeah those two are very different. Yeah. Um, when it comes to the hyperscaler, it comes down to can they? Mhm. Um, I think they would if they could, right? And they would have already. Um, I think it comes down to like inertia, talent even. Um, then that's not true for OpenAI, right?
Uh, but what's the problem there? I think it's going to be a bitter competition is the truth. Um, then you have to ask what principles do you really believe in and will it like prevent you from doing what I'm doing, right?
Like I think even if you made an effort to copy me or beat me or directly compete with me, you'd have to believe the things I believe in like wholeheartedly.
And I think most of the people at frontier labs I know, cuz I work at a small one, like really don't even care about Excel or they think that like we're all going towards an input output model anyway. Sure. Right.
That like you have to do a general training like paradigm to bring you just a model out at the very end and who cares if it's hardcoded like is a thing of the past anyway. Um I think there's like a bajillion dollars to make in between those two states.
Y um and I'm not even sure that their path brings you to that state first, right? Um so my argument or the way I think about competition is more like in principle versus the top labs, but it's more just maybe arrogance when it comes to hyperscalers. No, no, I no I I think that makes a lot of sense.
Um I mean OpenAI is fighting a war on like seven different fronts right now. They're like, "Oh, we're also going to do a phone. We're also going to do a browser and we're also going to do this and that and, you know, like a coding environment and all this." And like and that comes from the research.
How do you Yeah, I guess you're seven steps down the power law. Will they really build a competitor, a piece of spreadsheet software? Like they probably could, but that's pretty low on the priority stack for them I imagine.
I guess how your approach from a go-to-market standpoint just seems to be like create the best product possible and release it and see what happens. Will you have an enterprise motion over time? Do you know, are you going to set up a New York office and like pound the pavement? What? What?
No, you we're going for drinks, guys. We're like Now, now listen. Um I think people have a lot of false stupid pride about like their go to market motion, right?
Like I've mentioned Cursor four times, a lot of admiration, but they like they beat the drum that like we've never spent a dollar on go to market or distribution or ads. Like that's a flex, but it's stupid, right? Like you can also have a world-class sales team.
Um when I meet with the CIOs of the biggest banks in the world, like they want to go right now. Now I need to know like what it takes to have the right SWAT team to put that together, but I don't think that comes at the expense of what we're doing from like a prosumer bottoms up path. Yeah, totally.
Inevitably, one will seem to be more fruitful than the other and I will pour proportional resources towards it. I think clearly my gift is more bottoms up. Um but both can be positive-sum for you for sure. And I don't think Windsurf had Yeah, exactly. Exactly. Exactly. presence.
Uh, deal director in the chat says, "Nico said the finance bro end of times is delayed a couple months," which I think is correct. I wanted to give them a chance to get back. Yeah, you have two months to escape the permanent underclass. Yes. Yes. Yes. So, start vibe vibe modeling. Anyway, fantastic. Always good hanging