Galen Mead is building computer use base models at a research lab targeting AGI
Apr 23, 2026 with Galen Mead
Key Points
- Standard Intelligence is building computer use base models by training on screen recordings, treating the keyboard and mouse as universal actuators for a fully general approach to AI agents.
- Founder Galen Mead pivoted from audio models after concluding that DeepMind's game-playing training method could apply to real-world software work using vast, largely untapped video datasets.
- Mead expects AI to reach the skill level of professional software users like Premiere Pro operators by end of 2025, targeting industries without their own Copilot moment yet.
Summary
Standard Intelligence is building computer use base models as part of a broader push toward AGI. Mead dropped out of university roughly three years ago to pursue full-time research and founded the company when the concept of a "neo lab" didn't really exist yet.
“We build computer use models. We're a general research company working toward aligned AGI. You can train a policy on tons and tons of supervised data from diverse environments and get something that is a base model for actions — and it occurs to us that this exists for real-world work in the form of screen recordings, with a computer being the universal actuator that humans use to interact in very diverse environments.”
From audio to computer use
The company started with audio models, work Mead had carried over from an earlier project, largely as a way to demonstrate it could train state-of-the-art models and build credibility. He frames that early direction as a suboptimal strategic choice but not a serious mistake. The pivot to computer use came from a specific research intuition: the training approach DeepMind used on games (training a policy on large amounts of supervised data across diverse environments to produce an action base model) can be applied to real-world work using screen recordings.
The core bet is that a computer is the universal actuator humans use across diverse software environments, and that screen recordings represent an enormous, largely untapped dataset for pretraining agents. Mead wants to build what he calls a base model for actions, analogous to how language base models work, but grounded in how humans actually operate software.
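The "base model for actions" idea described above is, at its core, supervised policy learning (behavior cloning) on demonstration data. A minimal sketch of that training pattern, assuming a toy linear policy and synthetic (frame features, action) pairs standing in for screen recordings; none of the shapes, names, or the model are Standard Intelligence's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)

FRAME_DIM = 64    # stand-in for encoded screen-frame features
ACTION_DIM = 3    # e.g. (mouse dx, mouse dy, click score) -- illustrative

# Synthetic "screen recording" dataset: each frame is paired with
# the action the human demonstrator took on that frame.
frames = rng.normal(size=(512, FRAME_DIM))
true_policy = rng.normal(size=(FRAME_DIM, ACTION_DIM))
actions = frames @ true_policy

# Behavior cloning: fit a policy to the demonstrations by plain
# gradient descent on mean-squared error.
W = np.zeros((FRAME_DIM, ACTION_DIM))
lr = 0.1
for _ in range(200):
    pred = frames @ W
    grad = frames.T @ (pred - actions) / len(frames)
    W -= lr * grad

final_loss = float(np.mean((frames @ W - actions) ** 2))
print(final_loss)  # shrinks toward zero as the clone fits the demonstrations
```

The same pattern scales up in the obvious way: swap the linear map for a large sequence model and the synthetic pairs for real recordings, which is the "largely untapped dataset" claim in the paragraph above.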
Why computer use is fully general
Standard Intelligence takes video as input and outputs mouse state deltas and character-level keystrokes, mirroring exactly how a human interacts with a computer. Mead argues this makes the approach fully general since no one has copyrighted the form factor of a screen and keyboard. It also sidesteps the proprietary software access problem that could otherwise constrain agent-based automation.
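The interface described here (video in, mouse state deltas and character-level keystrokes out) can be made concrete with a small sketch. The field names, screen size, and `apply` helper are illustrative assumptions, not the company's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Action:
    mouse_dx: int            # horizontal mouse movement, in pixels
    mouse_dy: int            # vertical mouse movement, in pixels
    left_button_down: bool   # current left-button state
    keystroke: Optional[str] # single character typed this step, if any

def apply(cursor: tuple, action: Action, screen: tuple = (1920, 1080)) -> tuple:
    """Advance the cursor by the predicted delta, clamped to the screen."""
    x = min(max(cursor[0] + action.mouse_dx, 0), screen[0] - 1)
    y = min(max(cursor[1] + action.mouse_dy, 0), screen[1] - 1)
    return (x, y)

# A short "episode": move, click, type. The same three primitives
# cover any application, which is the generality argument above.
cursor = (100, 100)
for a in [Action(30, -10, False, None),
          Action(0, 0, True, None),      # press left button
          Action(0, 0, False, "h")]:     # type a character
    cursor = apply(cursor, a)
print(cursor)  # (130, 90)
```

Because the action space is just pixels and characters, the model needs no API access to the software it drives, which is the point made about sidestepping proprietary access.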
Timeline
Mead expects AI to reach the level of a capable Premiere Pro or Cinema 4D user by the end of 2025. He draws on his own experience doing freelance Blender animation in middle school, noting that revisiting those tools for a demo video felt like the stone ages compared to working with AI-assisted code editors. Industries that haven't yet had their "Copilot moment," in his framing, are the target.
On go-to-market, Mead is deliberately agnostic for now. The working assumption is that once the model can reliably execute complex tasks end-to-end, distribution is a download-and-pay problem.