Galen Mead is building computer use base models at a research lab targeting AGI
Apr 23, 2026 with Galen Mead
Key Points
- Standard Intelligence is building computer use base models by training on screen recordings, treating the keyboard and mouse as universal actuators for a fully general approach to AI agents.
- Founder Galen Mead pivoted from audio models after concluding that DeepMind's game-playing training method could apply to real-world software work using vast, largely untapped video datasets.
- Mead expects AI to reach the skill level of professional software users like Premiere Pro operators by end of 2025, targeting industries without their own Copilot moment yet.
Summary
Standard Intelligence is building computer use base models as part of a broader push toward AGI. Mead dropped out of university roughly three years ago to pursue full-time research and founded the company when the concept of a "neo lab" didn't really exist yet.
“We build computer use models. We're a general research company working toward aligned AGI. You can train a policy on tons and tons of supervised data from diverse environments and get something that is a base model for actions — and it occurs to us that this exists for real-world work in the form of screen recordings, with a computer being the universal actuator that humans use to interact in very diverse environments.”
From audio to computer use
The company started with audio models, work Mead had carried over from an earlier project, largely as a way to demonstrate it could train state-of-the-art models and build credibility. He frames that early direction as a suboptimal strategic choice but not a serious mistake. The pivot to computer use came from a specific research intuition: the training approach DeepMind used on games (training a policy on large amounts of supervised data across diverse environments to produce an action base model) can be applied to real-world work using screen recordings.
The core bet is that a computer is the universal actuator humans use across diverse software environments, and that screen recordings represent an enormous, largely untapped dataset for pretraining agents. Mead wants to build what he calls a base model for actions, analogous to how language base models work, but grounded in how humans actually operate software.
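The "base model for actions" idea described above is, at its core, supervised policy learning (behavior cloning) on demonstration data. A minimal sketch of that training pattern, assuming a toy linear policy and synthetic (frame features, action) pairs standing in for screen recordings; none of the shapes, names, or the model are Standard Intelligence's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)

FRAME_DIM = 64    # stand-in for encoded screen-frame features
ACTION_DIM = 3    # e.g. (mouse dx, mouse dy, click score) -- illustrative

# Synthetic "screen recording" dataset: each frame is paired with
# the action the human demonstrator took on that frame.
frames = rng.normal(size=(512, FRAME_DIM))
true_policy = rng.normal(size=(FRAME_DIM, ACTION_DIM))
actions = frames @ true_policy

# Behavior cloning: fit a policy to the demonstrations by plain
# gradient descent on mean-squared error.
W = np.zeros((FRAME_DIM, ACTION_DIM))
lr = 0.1
for _ in range(200):
    pred = frames @ W
    grad = frames.T @ (pred - actions) / len(frames)
    W -= lr * grad

final_loss = float(np.mean((frames @ W - actions) ** 2))
print(final_loss)  # shrinks toward zero as the clone fits the demonstrations
```

The same pattern scales up in the obvious way: swap the linear map for a large sequence model and the synthetic pairs for real recordings, which is the "largely untapped dataset" claim in the paragraph above.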
Why computer use is fully general
Standard Intelligence takes video as input and outputs mouse state deltas and character-level keystrokes, mirroring exactly how a human interacts with a computer. Mead argues this makes the approach fully general since no one has copyrighted the form factor of a screen and keyboard. It also sidesteps the proprietary software access problem that could otherwise constrain agent-based automation.
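The interface described here (video in, mouse state deltas and character-level keystrokes out) can be made concrete with a small sketch. The field names, screen size, and `apply` helper are illustrative assumptions, not the company's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Action:
    mouse_dx: int            # horizontal mouse movement, in pixels
    mouse_dy: int            # vertical mouse movement, in pixels
    left_button_down: bool   # current left-button state
    keystroke: Optional[str] # single character typed this step, if any

def apply(cursor: tuple, action: Action, screen: tuple = (1920, 1080)) -> tuple:
    """Advance the cursor by the predicted delta, clamped to the screen."""
    x = min(max(cursor[0] + action.mouse_dx, 0), screen[0] - 1)
    y = min(max(cursor[1] + action.mouse_dy, 0), screen[1] - 1)
    return (x, y)

# A short "episode": move, click, type. The same three primitives
# cover any application, which is the generality argument above.
cursor = (100, 100)
for a in [Action(30, -10, False, None),
          Action(0, 0, True, None),      # press left button
          Action(0, 0, False, "h")]:     # type a character
    cursor = apply(cursor, a)
print(cursor)  # (130, 90)
```

Because the action space is just pixels and characters, the model needs no API access to the software it drives, which is the point made about sidestepping proprietary access.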
Timeline
Mead expects AI to reach the level of a capable Premiere Pro or Cinema 4D user by the end of 2025. He draws on his own experience doing freelance Blender animation in middle school, noting that revisiting those tools for a demo video felt like the stone ages compared to working with AI-assisted code editors. Industries that haven't yet had their "Copilot moment," in his framing, are the target.
On go-to-market, Mead is deliberately agnostic for now. The working assumption is that once the model can reliably execute complex tasks end-to-end, distribution is a download-and-pay problem.