ARC Prize's Mike Knoop: AI is idea-constrained, not compute-constrained — we need new breakthroughs
Jun 18, 2025 with Mike Knoop
Key Points
- AI progress is now idea-constrained rather than compute-constrained, with reasoning models showing spiky domain-specific gains in math and coding but weaker transfer to legal work.
- Reasoning models lack product-market fit despite lab enthusiasm because wait times and complexity create friction that most users won't tolerate.
- Demand is shifting from human-labeled text to reinforcement learning environments that generate synthetic chain-of-thought traces, spawning startups like Mechanized AI, Morph, and Habitat.
Summary
Mike Knoop, co-founder of ARC Prize, argues that AI progress is no longer compute-constrained. The field is idea-constrained, and the next breakthroughs require new approaches, not bigger runs.
The Pareto frontier problem
Over the past six to nine months, every major lab has shifted from scaling pretraining on labeled text toward test-time compute and reasoning models that think out loud before answering. But there is no single winner: the labs have landed at different points on the cost-accuracy frontier. Anyone quoting a single benchmark number is marketing to you, Knoop says. o3 at its high setting leads on raw accuracy if cost and latency don't matter, while Gemini 2.5 Pro Thinking and Claude trade some horsepower for speed and price. The right choice depends entirely on the product context.
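The tradeoff Knoop describes has a precise form: a model belongs on the Pareto frontier if no other model is both cheaper and at least as accurate. A minimal sketch, with hypothetical model names and cost/accuracy figures invented purely for illustration:

```python
from typing import List, Tuple

def pareto_frontier(models: List[Tuple[str, float, float]]) -> List[str]:
    """Return names of models not dominated on (cost, accuracy).

    A model is dominated if some other model is no more expensive,
    at least as accurate, and strictly better on at least one axis.
    """
    frontier = []
    for name, cost, acc in models:
        dominated = any(
            (c <= cost and a >= acc) and (c < cost or a > acc)
            for n, c, a in models if n != name
        )
        if not dominated:
            frontier.append(name)
    return frontier

# Hypothetical (cost per task in $, benchmark accuracy) figures.
models = [
    ("model-A-high", 3.50, 0.88),  # most accurate, most expensive
    ("model-B-fast", 0.40, 0.79),  # far cheaper, slightly less accurate
    ("model-C",      0.90, 0.75),  # dominated by model-B-fast
]
print(pareto_frontier(models))  # → ['model-A-high', 'model-B-fast']
```

Both surviving models are defensible picks; which one is "best" depends entirely on whether the product can absorb the cost and latency of the high-accuracy option.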
Reasoning model adoption presents a counterintuitive problem. Despite excitement around DeepSeek's open-access reasoning chain, these systems may have weaker product-market fit than standard language models in their current form. Wait times and complexity create friction that most users won't absorb.
Spiky intelligence and domain specialization
OpenAI's original o3 results showed a striking pattern. Its reasoning gains in math and coding were dramatically higher than in legal reasoning, even though legal work involves the kind of symbolic, self-consistent logic that should transfer cleanly. Knoop reads this as early evidence that reasoning model improvements are domain-specific rather than general. He expects benchmark scores across labs to diverge meaningfully over the next 12 to 24 months as each lab optimizes its synthetic training environments for different domains.
The RL environment wave
The training paradigm shift has direct commercial consequences. Demand for human-labeled text is declining. Labs now want reinforcement learning environments that generate synthetic chain-of-thought traces autonomously, at scale, across long-running tasks. Knoop names several startups founded in recent months specifically to build and sell these environments to frontier labs: Mechanized AI, Morph, and Habitat. The comparison to Scale AI's earlier trajectory is direct: Scale grew on autonomous-vehicle labeling, then pivoted as that demand peaked. Knoop expects founder-led labeling companies to recognize the RL environment shift and place bets there if they haven't already.
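To make the shift concrete, here is a toy sketch of what such an environment looks like: a gym-style loop where the model emits reasoning steps as actions and reward arrives only at the end, based on whether the final answer is correct. Every successful episode yields a verified chain-of-thought trace to train on. All names and the task (two-digit multiplication) are illustrative assumptions, not any lab's or startup's actual API:

```python
import random
from dataclasses import dataclass, field

@dataclass
class ArithmeticEnv:
    """Toy RL environment that harvests chain-of-thought traces.

    The agent submits free-text reasoning steps via step(); an action
    beginning with 'FINAL:' ends the episode, and reward is 1.0 only
    when the stated answer is correct. Traces from reward-1 episodes
    become synthetic reasoning training data.
    """
    seed: int = 0
    trace: list = field(default_factory=list)

    def reset(self) -> str:
        rng = random.Random(self.seed)
        self.a, self.b = rng.randint(10, 99), rng.randint(10, 99)
        self.answer = self.a * self.b
        self.trace = []
        return f"Compute {self.a} * {self.b}. Show your reasoning."

    def step(self, action: str):
        """Returns (observation, reward, done), gym-style."""
        self.trace.append(action)
        if action.startswith("FINAL:"):
            guess = int(action.split(":", 1)[1])
            return None, (1.0 if guess == self.answer else 0.0), True
        return "continue", 0.0, False

env = ArithmeticEnv(seed=7)
prompt = env.reset()
env.step(f"Decompose: {env.a} * {env.b} = {env.a} * {env.b // 10 * 10} + {env.a} * {env.b % 10}")
obs, reward, done = env.step(f"FINAL: {env.a * env.b}")
# When reward == 1.0, env.trace is a verified reasoning trace ready for training.
```

The key property, and the reason labs want these environments rather than human labels, is that the reward check is automatic: correctness is verified programmatically, so trace generation scales with compute instead of with annotator headcount.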