Judgment Labs raises a Series A to help agent-native startups improve long-horizon agents from production data
May 12, 2026 · Full transcript · This transcript is auto-generated and may contain errors.
Featuring Alex Shan
Speaker 3: Great to
Speaker 2: see you.
Speaker 1: Have a great rest of your day. We'll talk to you soon. Goodbye. Up next, we have Alex Shan from Judgment Labs, an A Star portfolio company apparently, but here to talk about a new round from Lightspeed Venture Partners. Alex, how are you doing?
Speaker 10: Doing great, guys. Thanks so much for having me on board. Kevin's great.
Speaker 2: Also nice of him to open for you.
Speaker 1: Yeah. Exactly.
Speaker 10: Yeah. Some nice guys over here.
Speaker 1: He only gave us a little bit. So tell us a little bit about yourself and the company.
Speaker 10: Yeah. Happy to. And before I do that, we got a little bit of a crowd here with us.
Speaker 1: Let's hit the
Speaker 10: here out in the Office Financial District and the rest of SF, but thanks for having me.
Speaker 2: Explain the orange suits.
Speaker 10: Yeah. You know, we are Judgment orange. You know, it's the color of the company and we like to be proud about that. Mhmm. And so, you know, we wear it loud and proud. You'll catch us running the Bay to Breakers race in SF, hopefully winning it with the whole team. So be on the lookout for the orange suit.
Speaker 2: It's still such a good strategy to pick a color that doesn't have a loud startup already kind of anchored around it. Yeah. Just own it.
Speaker 1: Good luck going up against Facebook blue or Coinbase blue. Yep. Much easier to stand out with a funny color yellow
Speaker 2: or Anyways.
Speaker 1: Please introduce the company. Tell us about the progress, the news, everything.
Speaker 10: Thanks a lot, guys. You know, Judgment Labs is the platform for improving long-horizon agents from production data. We sort of started the company with this thought process that these autonomous long-horizon agents are gonna consume the vast, vast majority of the economic value that AI is set to create across the next decades in the economy. Yep. And, you know, we're already seeing the first sets of those things happening. Developer productivity is skyrocketing at rates that I don't think anyone, including myself
Speaker 5: Yeah.
Speaker 10: coming from a research background, could have anticipated. And yet, that sort of progress is also barbelled in the sense that in many industries, we don't see the same progress on these long-horizon agents. And so, typically, that has a lot to do with how verifiable or semi-verifiable the outcomes are and therefore how easy it is to train those agents. But we believe that regardless of what agents you're building, the single source of truth to improve them to get to that point is going to be production data. During their runtime, these long-horizon agents emit so much production data, ranging from their reasoning tokens to the tools they call to the retries and their memory and all the other things that are going to come out on these agents. And so that data just forms the cleanest record of how those agents behave with customers, software, and their broader environments.
Speaker 4: Mhmm.
Speaker 10: And when, you know, you process that right, we can sort of find out what users actually ask for and struggle with, which failures actually happen in production, and where agents actually find breakthroughs to solutions that we could have never predicted. And so, therefore, the goal for us as developers and sort of people that are going to bring about this agent revolution is to operationalize that production data for all these agent companies out there to create these flywheels that convert distribution advantages of, you know, people like Sam into product moats that are going to last them for a long time. And so we've been very lucky to partner with a lot of people across the journey of the company. Coming out of stealth today has been an amazing journey and, you know, Nova backing us at the pre-seed all the way to Lightspeed backing us at the seed and Lightspeed doubling down to co-lead the Series A with Greenoaks.
Speaker 1: What's the sweet spot customer look like right now?
Speaker 10: So we love partnering with people who are building, you know, agent-native companies and focusing on long-horizon agents. And
Speaker 1: So can you break it down: startups, like scale-ups, like Series B companies, like what's the
Speaker 10: Primarily agent-native startups in that range of like Series A to Series B is our sweet spot. We focus on these companies that produce a lot of production data
Speaker 1: Yeah.
Speaker 10: And wanna figure out how to use it.
Speaker 1: And is it particularly focused on knowledge work and sort of like the next iteration of AI agents or are you doing coding work as well or both?
Speaker 10: Definitely a lot of coding work. Okay. You know, we actually serve agents across the stack. Sam and his company, Monaco, are customers of ours. And so, different vertical agents require different versions of improvement. And so, we work across the stack but mainly on these new-age long-horizon agents that sort of autonomously do tasks end to end.
Speaker 1: Yeah. What are you helping with specifically? Because I imagine that if Monaco builds an AI SDR, an agent that goes and runs around and figures out everything about a customer and builds scripts and battle cards and all the stuff that they do, all of that's logged. They have it somewhere. Where is your value at? Like, why are they a customer? Like, are you helping them actually change production designs? Or is it more about organizing and unifying data for them to go improve their own products?
Speaker 10: Great question. Mostly the latter.
Speaker 1: Okay.
Speaker 10: So if you think about it in practice, improving those agents is really challenging for teams. Okay. And so, like, most teams with all those logs that you're saying that they store have to sort of manually comb over these tables and tables of data. Okay. And so, whenever they find a failure case, the question often is not just like, is this a problem? But it's, you know, how frequently does this happen? Sure. Are our most important customers affected? Yeah. What task types are most affected?
Speaker 1: Yeah.
Speaker 10: And so being able to chop up and parse this data using other agents, in fact, to sort of pinpoint the exact failure modes and therefore the exact part of, you know, an agent framework or an agent harness is exactly what we help these companies do.
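The triage questions raised above (how frequently a failure happens, whether the most important customers are affected, and which task types are hit) can be sketched as a small aggregation over logged runs. This is a minimal illustration under stated assumptions, not Judgment Labs' implementation: the `Trace` record, its fields, and the `summarize_failures` helper are all hypothetical names introduced here.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Trace:
    """One logged agent run (hypothetical schema, for illustration only)."""
    customer: str
    task_type: str
    failure_mode: Optional[str]  # None means the run succeeded


def summarize_failures(
    traces: Iterable[Trace], priority_customers: Set[str]
) -> Dict[str, dict]:
    """Group failed runs by failure mode and answer the three triage questions."""
    all_traces: List[Trace] = list(traces)
    total = len(all_traces)

    # Bucket only the failed runs by their failure mode.
    by_mode: Dict[str, List[Trace]] = defaultdict(list)
    for trace in all_traces:
        if trace.failure_mode is not None:
            by_mode[trace.failure_mode].append(trace)

    summary: Dict[str, dict] = {}
    for mode, failed in by_mode.items():
        affected = {t.customer for t in failed}
        summary[mode] = {
            # How frequently does this happen?
            "count": len(failed),
            "rate": len(failed) / total,
            # Are our most important customers affected?
            "priority_customers_hit": sorted(affected & priority_customers),
            # What task types are most affected?
            "task_types": Counter(t.task_type for t in failed),
        }
    return summary
```

Calling `summarize_failures(traces, {"acme"})` on a list of such records returns one entry per failure mode, which can then be sorted by `rate` or filtered to the modes that touch priority customers.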
Speaker 2: Interesting. Jordy? Next breakout agent category.
Speaker 1: Use case.
Speaker 2: Yeah. Coding.
Speaker 10: Yeah. You know, we tend to believe here at Judgment that the domains that are going to get solved first are the ones that are most verifiable.
Speaker 1: Yes. Meaning that
Speaker 2: if you
Speaker 10: can check the answer, you know, the smallest feedback loop, exactly. I would say that a lot of these domains are going to be the ones that are quantitative. You know, these are things like, you know, the coding agents are easy. You can imagine the site reliability and ticket resolution agents next and then the ones that do math. But, you know, we are increasingly seeing a lot of progress in non-verifiable domains as well. Stuff that you would traditionally think is not very easy to quantitatively measure, such as finance and legal and even sales, say, what Sam's agents are doing, is moving incredibly, you know, fast in terms of how the teams have been able to use their data to improve their agents.
Speaker 1: Tax and accounting seems pretty verifiable and, like, closed loop versus, you know, like, I don't know, long horizon, how did this cancer drug respond to someone over a decade? Like, it's very hard to close that loop. Well, congratulations on all the progress. Jordy, any other questions?
Speaker 2: No. This was great.
Speaker 1: Thank you so much for coming on the show.
Speaker 10: Go orange. Great.
Speaker 5: We'll talk
Speaker 1: to you soon. I'm sure
Speaker 2: you'll be back on.
Speaker 1: Sounds good.
Speaker 2: Congrats to the whole team. Have a good day. Cheers.