Raindrop launches free open-source local agent debugger, eyes partnership with major coding platform
May 14, 2026 · Full transcript · This transcript is auto-generated and may contain errors.
Featuring Ben Hylak
Speaker 2: There was another drama in the tech world yesterday, but we'll come back to it after our next guest Ben Hylak from Raindrop joins. I believe he's in the waiting room. So we'll let him come in. He's the cofounder and CTO. We've had him on the show before. Welcome back, Ben. How are you doing?
Speaker 3: Doing well, man. How are you?
Speaker 1: Fantastic. Great to see you.
Speaker 1: Long time.
Speaker 2: What's new in your world? Reintroduce the company quickly and then tell us the news.
Speaker 3: Sure. Raindrop, we make observability for agents. So the main thing we do is self-healing agents. What that means is that when your agent hits a problem in production, we detect it, we fix it.
Speaker 2: Mhmm. How do you do it?
Speaker 3: That's a good question. So at the end of the day, like, it's we consider ourselves like the intelligence for your intelligence. Mhmm. What that means is that we are the best fastest way to essentially look at anomalies. Mhmm. So what that means is that, like, let's say you make a change. Right? We're able to very, very quickly find out that, like, oh, users all started complaining about something, or the trajectory, the traces are kind of starting to evolve into a different pattern.
Speaker 2: Mhmm.
Speaker 3: And so it's kind of a combination of agents, but also more like classic ML techniques. A lot of like custom trained models for every customer.
Speaker 2: Walk me through the shape of the agent market right now. Like, the way you're talking about it, you know, sort of illustrates the broad diffusion of agents and custom agents. I think that a lot of people think Claude Code and Codex. And I don't know if you're doing enterprise deals with those firms or that's the goal, but I imagine that every startup, many legacy companies have built some sort of agent, some sort of harness. And I'd love to know the shape of how broadly diffusing custom agents are in companies versus is it the domain purely of startups that create an agent for legal or an agent for sales and then they sell that into a company?
Speaker 3: Yeah. So I would say that there's two kind of categories of customers. Mhmm. We started with super high growth startups, at the time startups. So those are companies like Clay, for example Sure. Framer, speak.com. Some of the fastest growing companies in the world, and those are some of our earliest customers. We're lucky they grew a ton, so, you know, that has helped our growth.
Speaker 1: Always helps.
Speaker 3: It always helps. Yeah. And someone once mentioned that, like, you know, this kind of business is a lot like early stage seed investing, actually. Yeah. It's kind of interesting. Like, you know, you have to be pretty picky not to work with companies that are gonna die. Because, like, especially analytics, these sort of things, you succeed as a company when your customers succeed. Like, if all of your customers are terrible, everyone's like, well, why are your insights...
Speaker 1: A portfolio company that was working on, like, agent infrastructure, like, roughly two years ago pivoted because he was like, okay, this is clearly gonna be a big thing someday, but right now he's looking at all the underlying companies, and he's like, I don't believe that any of these agents in their current iteration are gonna work. Now, maybe they're starting to. Right?
Speaker 3: I think it was very counterintuitive at the time, but I think we chose to find companies like clay.com, right, which were clearly on an insane trajectory, but at the time, you know, weren't necessarily as large. And so I think a lot of our customers now are pretty large, but at the time weren't necessarily as large. And then in the last few months, we've been moving into Fortune 50s, Fortune 100s, like, a lot of amazing things happening there. And, again, it's kind of like two shapes of a product. Like, one is, like, our bread and butter, which is, like, you know, companies that are redefining the way people, you know, interact in different verticals. But then, yeah, there are, like, Fortune 50s, Fortune 100s that are also deploying agents internally. I think the shape of that looks very interesting, and, like, being on the forefront of understanding how these companies are deploying things, there's not that much I can talk about right now. But, yeah, always very interesting.
Speaker 1: What do you think is a generally underhyped agent category right now? I'm sure you're seeing the future a little bit.
Speaker 3: It's a really good question. So this is a tough question. What I wanna do actually is pivot the question a little bit, because I wanna talk about our launch today, if that's okay. I wanna share it with you guys.
Speaker 2: Yeah. Yeah.
Speaker 1: I'll tell you my questions. You tell me your answers.
Speaker 3: Okay. Okay. Sounds good. Does that mean that you want me to not answer this?
Speaker 1: No. No. No. No. I'm just messing around. Go for it.
Speaker 3: Okay. Cool.
Speaker 1: Yeah. I botched it.
Speaker 2: The joke is, "What questions do you have for my answers?" And some CEOs show up and they act like that, where it's like you could ask them anything and they're just gonna redirect to a talking point. But it's fine. I wanna hear about the launch today, so just tell us about it.
Speaker 12: Great. Let's talk about it.
Speaker 2: Let's go.
Speaker 3: Okay. So, guys, there's been this crazy thing that has been missing for a very long time. That's why I want to talk about it. So, like, people have been building agents. Mhmm. You're building them locally. Like, you're using some sort of SDK. It could be OpenAI's. Could be Vercel's. Whatever SDK it is.
Speaker 2: What do you mean locally? Like, it actually has to run on, like... or in development?
Speaker 3: Right. Like, before you push to production.
Speaker 5: Sure.
Speaker 3: Right?
Speaker 1: Sure.
Speaker 3: It's on your laptop.
Speaker 2: Yeah. Yeah. Yeah.
Speaker 3: There's no way to see what it's doing. Like, no standard way, nothing. Like, so people will send those traces out to, like, a server. Like, Raindrop is, like, one of those, you know, and there's a bunch of others. Yeah.
Speaker 2: But they might also just drop the logs in, like, a non-relational database. Sure.
Speaker 3: They'll just print it to, you know, console.log, like, oh, here's what was happening. It's, like, that bad.
Speaker 2: Yeah. Yeah.
Speaker 3: Yeah. And the other problem there is, like, so you can't see, like, a nice trace, or you're sending it to some server and it takes, like, seconds to see everything. I'm like, whatever. It looks terrible. But then also, your coding agent can't, like, see the traces either. So then when you hit a problem and you're like, hey, you know, this response was wrong, Claude Code will just make shit up. Like, it'll just be like, oh, I think that, like, maybe this tool was wrong, or I think maybe, like, this happened.
Speaker 2: Mhmm.
Speaker 3: Because it doesn't have any of that data. Doesn't actually know what the coding agent did. Yeah. So I think that, like, as someone building agents, as, like, our company building agents, it's actually kind of embarrassing how long it took us to solve this problem. No one else solved it either. But, yeah, that's what we launched today. Free local open source tool, raindrop.ai/workshop, and it's completely free. Like, it's just open source.
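To make the gap described above concrete, here is a minimal sketch of local trace capture in Python. It is not Raindrop's tool or API; the `LocalTracer` class, the `read_trace` helper, and the JSON Lines file layout are all hypothetical names chosen for illustration. The idea is only that each agent step is written to a file on the laptop, so that both the developer and a coding agent can read what actually happened instead of guessing.

```python
import json
import time
import uuid
from contextlib import contextmanager
from pathlib import Path


class LocalTracer:
    """Hypothetical local trace recorder (illustration only, not Raindrop's API)."""

    def __init__(self, path):
        self.path = Path(path)
        self.trace_id = uuid.uuid4().hex  # one id shared by every span in this run

    @contextmanager
    def span(self, name, **attributes):
        """Record one agent step (LLM call, tool call, ...) as a JSON line."""
        record = {
            "trace_id": self.trace_id,
            "name": name,
            "attributes": attributes,
            "status": "ok",
        }
        start = time.perf_counter()
        try:
            # The caller may attach extra fields, such as the step's output.
            yield record
        except Exception as exc:
            record["status"] = "error"
            record["error"] = repr(exc)
            raise
        finally:
            record["duration_ms"] = round((time.perf_counter() - start) * 1000, 3)
            with self.path.open("a", encoding="utf-8") as f:
                f.write(json.dumps(record) + "\n")


def read_trace(path):
    """Load every recorded span, in the order the steps finished."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

In use, a tool call would be wrapped as `with tracer.span("tool_call", tool="search") as s: s["output"] = result`, and the resulting file could be handed to a coding agent as plain text.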
Speaker 2: Why open source?
Speaker 3: That's a really good question. I mean, I think the genuine answer, I think part of, like, why our competitors, like, haven't done it. I mean, there's probably other reasons for that as well. But I think it's that it can be. Right? Like, someone else can do it. You know what I mean? Like, I think that running locally is the best experience for people. Mhmm. And to be clear, like, there's still things that it enables, you know, if you connect it to your production Raindrop, which is, like, it can pull in a remote trace and replay it. And then Claude Code or Codex can just keep doing that loop until it works. So there's still benefits for us. But also, the truth is that we want people to hack it. We want people to meld it into whatever works for them. So we use a lot of open source things here. Right? So it makes sense to contribute back as well.
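The replay-and-retry loop mentioned above can be sketched roughly as follows. This is an assumption-laden illustration, not how Raindrop, Claude Code, or Codex implement it: `run_replay` and `apply_fix` are hypothetical callables standing in for "replay the recorded trace" and "let the coding agent change the code", and the round limit is an added safeguard so the loop cannot run forever.

```python
from typing import Callable, Optional


def fix_until_passing(
    run_replay: Callable[[], Optional[str]],
    apply_fix: Callable[[str], None],
    max_rounds: int = 5,
) -> int:
    """Replay a recorded trace, hand any failure to a fixer, and repeat.

    run_replay returns None when the replay succeeds, or a description of
    the failure otherwise. The return value is the number of replays it
    took to pass; RuntimeError is raised if the budget runs out first.
    """
    for round_number in range(1, max_rounds + 1):
        failure = run_replay()
        if failure is None:
            return round_number
        # Give the failure description to the fixer before replaying again.
        apply_fix(failure)
    raise RuntimeError(f"still failing after {max_rounds} rounds")
```

Bounding the loop matters because each round can also make things worse, which is the "how many loops in a row" concern raised later in the conversation.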
Speaker 1: Yeah. That's great.
Speaker 2: Yeah. I'm I'm wondering about other just, like, predictions about the next breakout category of AI agents, what you're seeing. Feels like you're so close to being able to book a flight, but maybe no one wants that. I don't know.
Speaker 3: I mean, I'm not sure if you guys saw. I had a little bit of a thing with Brian Chesky earlier about Airbnb.
Speaker 6: Oh, yeah. Oh, Let's talk about that.
Speaker 3: Talk about that. No. I think, like, you know, I use Airbnb a lot. I love Airbnb. I think if I had to guess, I would say Brian Chesky knows a lot more about Airbnb than I do and probably a lot more about being a founder than I do as well. True. And so I think there's probably a lot that I'm not considering. That being said, I think, like, if Airbnb had an API, I would use it, and I would book Airbnb with it, like, through Claude Code. Right? So it's like, I know I would do it. Sure. I find Airbnb very, very hard to search. And I think the tough part, and, like, what I see industry-wide right now, everyone trying to figure out, is you see companies almost reducing themselves into an API with, like, absolutely no moat. Like, you look at, like, Photoshop, Illustrator, etcetera, they're like, oh, we have a Claude Code integration now, MCP. At the point where people are just using Photoshop, Illustrator, etcetera, as, like, an MCP, they've sort of lost the game. Right? Like, if no one's actually touching the UI anymore. I think that right now, companies have to do that increasingly because they have no other choice. I think that there will be a point where the incentives don't make sense anymore. Like, I can give an anecdote from when I was at Apple. You know, do you guys remember App Clips?
Speaker 1: Yeah. Yeah. There's more App Clips.
Speaker 2: Where did those go? I only see them with, like, parking meters sometimes.
Speaker 1: Where did those go?
Speaker 6: Yeah. Right.
Speaker 3: So, one of like the hero ideas there was like, oh, you know, like, imagine you're in line at Starbucks, you don't have the Starbucks app downloaded. Like Yeah. Well, why not just, you know, scan something, have an app order your drink, and it's like, turns out Starbucks doesn't want that. Right? Sure. Like, that's the last thing in the world Starbucks wants.
Speaker 1: Oh, Starbucks wants you to download the app.
Speaker 3: They want you to have stars. They have an entire... Like, there's a reason why DoorDash and Uber Eats and, like, whatever, you know, God knows what other apps exist. It's not because they need to, but because each are companies with money and goals. So why would they reduce themselves into an easily interchangeable API? It doesn't actually make sense.
Speaker 1: Yeah. But I think it's important to be careful around using, like, a tool like Photoshop interchangeably with, like, a retail store like Starbucks or, like, a marketplace like Airbnb or DoorDash, because I really think that these marketplaces provide, you know, an exceptional amount of value. All the value is not in the UI. Right? Agree. I agree. And, like, the value of Starbucks is not that it's a pretty app. It's because they have, yeah, specific drinks that they can make pretty much anywhere, you know, someone would be
Speaker 2: Yeah. I think of that company, Buy, the drink company. They got started during the direct-to-consumer boom. Obviously, they would have some beautiful Shopify website. They didn't. They just went, yeah, direct to retail and they had Amazon. You could order it online, and if you went to their website, it would just say go to Amazon. And they did fine. A billion dollar company. Yes. And because, like, the value is not in the e-commerce experience, they didn't play, like, the stars game. Of course, Starbucks is maybe sacrificing a piece of that business model, but it's not giving away the whole cow. I don't know.
Speaker 3: There are going to be ways to monetize this. Right? Like, there are going to be successful business models built on top of this sort of layer. And to be honest, as Raindrop goes into the future, like, that's the future we're building towards. That's the future we want. I mean, we're gonna be announcing a partnership with one of the, like, large coding companies soon, as far as, like, integrating with them more, where it's like, I don't see Raindrop as a company that's going to submit PRs in production to people's code bases. Like, someone else is gonna be doing that. We're gonna be the layer that's really good at finding those issues, diagnosing them, and tracking them.
Speaker 2: Yeah.
Speaker 3: So I just think it's going to be interesting into the future how much companies are willing to sort of just, like, be the API without all those hooks, without the, you know, knowing everyone's email, like, having the mailing list, like, all that sort of stuff. Yeah. So that's a very interesting trend.
Speaker 2: I feel like you're generally on the frontier and cutting edge of, like, adopting all these tools. You mentioned your Claude Code use, and I'm wondering about, give me a reality check, a health check on your experience with computer use, because you're lamenting the fact that Airbnb doesn't have an API. And I imagine you could create a scraper or download the HTML and interact with it, like, treat the front end as the API, effectively puppeteer the computer through computer use. Like, where are you on, like, the AGI moment in computer use from what you've seen? Like, where does it work? Where doesn't it work? Where would you recommend people get started if they want to play around with it?
Speaker 3: Yeah. That's a really good question. There's other places where it works. Like, I think that Codex has done a very good job of implementing, like, browser use actually. Yeah. Both for, like, debugging applications that you're working on and in general. Like, this is something that Claude Code just, like, doesn't do. Like, it creates a really... again, I think the next couple months, the thing you're going to keep hearing from me, but also everyone in the world, is, like, self-healing loops, loops, loops, loops. Right? Like, how do you create loops? How do you close the loop? Yep. How do you have, like, Claude Code make a UI change, see that it sucks, and then just keep going. Right?
Speaker 13: Yep.
Speaker 3: And a lot of like, do we have AGI or not, is how many loops in a row can you do It's all loops. Before Right? Before things just end catastrophically. Right?
Speaker 2: Yeah.
Speaker 3: Because there is sort of this, like... Yeah. It gets worse, right? Yeah. In many cases. So, I think, like, there's a lot of, like, ways to answer this. Like, I'm a fairly, like, security-conscious person. I think that, like, you know, I'm not, like, an OpenClaw guy. I'm not gonna give all of my, like, cookies to some, you know, agent, etcetera, etcetera. But yeah, I think... Okay. TBD. Yeah.
Speaker 2: Yeah. Cool. Well, yeah, new challenge. Book an Airbnb with an agent. Can it be done? Yes. Is that where the goalposts need to be set? Let's figure it out. Anyway, thank you so much for coming on the show.
Speaker 6: Great to see you, Ben.
Speaker 1: Congrats to the team. Congrats on the launch.
Speaker 3: Of course. Last thing, if you want a hat, we have a new CLI. You can run Raindrop Drip. You can get a hat and umbrella, couple of other...
Speaker 2: Oh, that's a fun that's a fun way to give out I like that. That's very creative.
Speaker 1: Great stuff. We'll talk to you soon. Bye.
Speaker 2: Okay. Back to the debate around Figure. We've had Brett Adcock on the show before and he had a livestream. We talked about it a little bit. Watch a team of humanoid robots running a full eight hour shift at human performance levels. And Brett Adcock said this is fully autonomous running Helix two.
Speaker 1: Alright. Pull up this post. The one from Pete.
Speaker 2: Yes. And the stream did fantastically. It was twenty four hours. It got 3,400,000 views. But at a certain point during the stream, there were some questions about whether or not the humanoid robot was...
Speaker 1: Alright. Back to the beginning. Back to the beginning. Okay. Let's play this. Alright. So it's cooking. I mean, the speed is actually...
Speaker 2: I'm extremely impressed by this. This was remarkable.
Speaker 1: Remarkable. Even if it's teleoperated, it's extremely impressive.
Speaker 2: Yeah. Yeah. Yeah. Like the robot's clearly working.
Speaker 1: This is very saying that it's not teleop.
Speaker 2: Okay. So then the robot starts missing things, being a little bit, like, an inch off, and then reaches up and touches its head, which is something that wouldn't normally be necessary. It doesn't have, like, a logical explanation or conclusion. So a lot of people are...
Speaker 1: It does have a semi-logical conclusion. Brett is claiming that when it reaches across its body to go to the right, it puts its hand up here to get the hand out of the way.
Speaker 2: That's what I was thinking, that if the hand is halfway up, it's blocking the sensor, the camera sensor. And so the robot might reach the hand up further to move it out of view so then the robot can look at the next package. So that's one possible explanation. But a lot of people are asking even harder questions, saying, potentially, was there a human in the loop? Was this teleoperated? Brett has said it's fully autonomous. I feel like that means no humans in the loop. But Tior Taxes has an artist's representation of Helix 02, Figure's in-house neural network running entirely onboard. And it, of course, is a human in a VR headset. Very, very debatable. We'll see where you stand. But there is a third option, which I have shared, which is potentially no humans involved. I don't know if you'd call it autonomous, but you would call it no humans in
Speaker 1: the loop
Speaker 2: because you have
Speaker 1: Well, it is an autonomous system. Right? It just sort of runs.
Speaker 2: Yeah. I would I would consider this autonomous. It's the it's the image that I shared in the production chat. It's not of a human and it's not quite robotic but there's no human in the loop. And so this could explain it's the system is running with no humans in the loop. If you make that claim
Speaker 1: Running and you follow this,
Speaker 2: I think this qualifies as no humans in the loop. If you have a giant orangutan in a VR headset puppeteering the robot via teleoperation, you could say that this system does not have a human in the loop. And you could make that
Speaker 1: And I could make the argument that it's autonomous. Yes. The chimpanzee is running its own it has somewhat of a neural neural Yes.
Speaker 2: Yes. No, no, no.
Speaker 1: No one knows. And Bill says I think there was a human physically inside. Oh.
Speaker 2: Physically inside. As an option.
Speaker 1: Yeah. Mean the the Red, Chime in. The thing that I'm so I I wanna talk with somebody at a place like Amazon
Speaker 2: Yeah.
Speaker 1: Who I imagine does this kind of thing all day long.
Speaker 2: Yeah.
Speaker 1: And are they asking for a humanoid to do this process? Yeah. Like, this seems like something that e-commerce fulfillment, yeah, and logistics companies have been doing for many, many years. Yes. Is there not a purpose-built robot that sits right there and makes sure that, yes, the packages are in the right orientation? Does it have to, you know...
Speaker 2: Yeah. If you watch an episode of How It's Made, you will see every variety of custom-made machine for flipping around, sorting packages, that type of activity. There are custom-built machines that run at scale. They might cost, like, $10,000, but they last fifty years. And anytime you see, you know, a Diet Coke factory or gum manufacturing line, all these things, like, the gum that you have there comes off and the gum rattles down and is sorted into the pockets of the packaging, and then the sleeve is wrapped around and glued, and all of that is done autonomously but just with, you know, a bunch of machinery that was built probably, like, a hundred years ago, honestly. If it works, don't fix it. But you can clearly see how this type of task, package sorting, would be, like, on the curve to a more economically valuable humanoid robot. And, like, if I was going to buy a humanoid robot to do my dishes and you showed me this video and it was in fact fully autonomous, that would be an encouraging demo to me. That would be something that I would look at and say, oh, well, like, if it can do this successfully for hours and hours and hours, I'd probably trust it to put some laundry in the washing machine. That doesn't seem well beyond the scope of capabilities. It's so
Speaker 1: interesting how quick it is Yeah. When it's just sorting packages there and then it doesn't divide and walk on the way off.
Speaker 2: Yeah.
Speaker 1: If you rewind for a second Yeah.
Speaker 2: Yeah. Yeah. The walk is not Walk.
Speaker 1: I only use that terminology because that's the terminology that
Speaker 2: That is used. Yeah.
Speaker 1: Like, look. Why does
Speaker 3: Yeah. Look
Speaker 2: like that. If you were able to shuffle like this so fast, you'd think that you'd be able to hustle a little bit. But maybe that's v2. Maybe that's less relevant for this particular task. You know, there's a lot of different options, but we will dig into it. Brett launched day two. I mean, putting up views, sorted 32,000 packages. Day two is live. And he shared more details on what's going on. "The original goal was an eight hour run. After zero failures yesterday, we decided to keep going. We're now over twenty four hours of continuous autonomous operation without failure. This is uncharted territory. The task is small package sorting. F.03 detects the barcode, picks up the package and reorients it barcode face down onto the conveyor. Humans average around three seconds per package. F.03 is now around human parity. The robots are reasoning directly from camera pixels. The robots are fully autonomous using Helix 02, our in-house neural network running entirely onboard F.03. There's no teleoperation. Every action comes directly from Helix 02." Okay. Well, I feel like that rules out the monkey business. I think, if you had a monkey puppeteering this thing, it would count as teleoperation. So he is denying that allegation from the timeline. Yeah. But the timeline seems convinced. "YouTube commenters started naming the robots Bob, Frank, Gary, so we added name tags to each robot. And if the robot gets stuck or the AI policy goes out of distribution, Helix triggers an automatic reset. You'll occasionally see this happening during the livestream. If a robot has a software or hardware issue, it autonomously leaves for maintenance and another robot takes over. We run our labs at Figure this way to maximize uptime. We haven't had a failure yet, but statistically we probably will at some point."
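As a rough sanity check on the figures quoted above, the sketch below works out how many packages one station would sort at a steady three seconds per package. The single-station, no-pauses assumption is ours; the transcript does not say how many robots or stations contributed to the 32,000 figure, or exactly how many hours it covers. The function names are illustrative.

```python
def packages_sorted(hours: float, seconds_per_package: float = 3.0) -> int:
    """Packages one station sorts in `hours` at a constant pace (assumed)."""
    return int(hours * 3600 // seconds_per_package)


def hours_needed(packages: int, seconds_per_package: float = 3.0) -> float:
    """Hours one station needs to sort `packages` at a constant pace (assumed)."""
    return packages * seconds_per_package / 3600
```

Under those assumptions, `packages_sorted(8)` is 9,600 and `packages_sorted(24)` is 28,800, while `hours_needed(32000)` comes to roughly 26.7 hours, so the quoted total is in the range a little over a day of continuous sorting at human pace would produce.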
So very, very fun going back and forth. Who else is chiming in? Dara says, "I'm the last person I'd expect to rush to Figure's defense, and I'm looking forward to hearing Brett's take here. And in any and all cases, I stand with PBD King. But IMO, this demo seems authentically autonomous, and I could see this being learned behavior from teleoperators that collected the data for this model with their VR headsets." And PBD sucks, who broke the story or went viral first, said he actually has a pretty reasonable-sounding excuse, but it doesn't give me tons of confidence on the model's brittleness: for cross-body reach, the policy lifts its arm to avoid hitting the metal. Shoot. Nice try. I wasn't sure if he was gonna reply to this and sort of engage or just sort of let the timeline run wild with it. But the metal plate does seem like a piece of what's going on, but people are still hungry for teleoperation bombshells. It sort of cuts both ways. I remember Jason Carman did a video, maybe with 1X, and everywhere in the video, they said this is teleoperation. We're doing teleoperation. We're bullish on teleoperation. Put it at the bottom in the text and the description, like, so told everyone. And still people were quote tweeting and being like, this is teleoperation. And so people, you know, are sort of grappling with, like, what is real, what is fake, constantly. Well, is there anything else on the Figure story that you'd like to dig through? No. Switch his hands after working more than four hours straight. Well, we
Speaker 1: We've dug into a lot in just a few minutes. A newly released OGE form, Office of Government Ethics Form 278-T, discloses that President Trump filed 3,642 trades involving stocks of public companies between January 1 and March 31. Transactions include hundreds of stocks and ETFs such as Nvidia, Microsoft, Broadcom, Amazon, Apple, Alphabet, Meta, Goldman Sachs, AMD, Airbnb, Palantir, Netflix, Costco, Walmart, JPMorgan, DoorDash, and others. Individual purchases of Nvidia, Microsoft, Broadcom, Amazon... So he's averaging roughly 40 trades a day.
Speaker 2: 40 trades a day.
Speaker 1: Check my math there. Okay. That is in Q1. It's a lot of trading activity.
Speaker 2: We talked about this.
Speaker 1: Selling.
Speaker 2: Should you just give Jane Street write access to the federal, you know, government? Should they just be able to change the laws to optimize for max GDP growth? And it feels like we're one step closer to the economic singularity of the hedge fund running the country.
Speaker 1: Anyway, we have another
Speaker 2: yeah. What else?
Speaker 1: I'm trying to find the history of presidential day trading.
Speaker 2: I don't know if there is one. Jimmy Carter famously divested from his peanut farm because he was worried about conflicts of interest but we are in a new era. Anyway, we'll have to figure out if Trump is long or short the Cerebras IPO. He's probably watching right now to hear Doug O'Laughlin's take on it to understand what's
Speaker 1: George W. Bush.
Speaker 2: With Cerebras.
Speaker 1: Catch up with Dee says
Speaker 2: Yes.
Speaker 1: Not a day trader.
Speaker 2: Oh.
Speaker 1: He had a famous controversial stock sale. He sold 200,000 Harken Energy shares in 1990 before bad news came out.
Speaker 2: Okay.