Softinn is building foundation models for computer use, starting with synthetic data pipelines
Key Points
- Softinn is training foundation models to automate computer tasks by mimicking human behavior—clicking, typing, navigating interfaces—rather than calling APIs, targeting enterprises whose legacy software lacks reliable programmatic access.
- Training data is the core constraint; Softinn generated virtually its entire dataset synthetically for its April 2025 model release and plans to blend synthetic data with human-labeled examples and data from live product usage over time.
- Softinn sells API access to enterprises and startups across verticals, positioning low-complexity, high-frequency tasks like email cleanup and calendar scheduling as the wedge before tackling genuinely complex work.
Summary
Noah Löfquist is building Softinn to train foundation models that operate a computer the way a human would — navigating interfaces, clicking, typing — rather than relying on structured API calls. The pitch is that a huge swathe of enterprise software either has broken APIs or no useful API at all, and the only reliable way to interact with it is through the screen.
The near-term target is low-complexity, high-frequency tasks: cleaning up an email inbox, scheduling a calendar, hunting down receipts. These are cognitively simple for a human but tedious enough that enterprises will pay to automate them. Löfquist frames them as the wedge before the model can handle genuinely complex work.
“At Softinn, we're building computer use agents. So we're training foundation models on how to use a computer like a human to really automate any type of work... A big part of it is actually the data. There's not a lot of really good data to use... Pretty much our entire dataset was just synthetically generated.”
Data as the core bottleneck
The biggest constraint isn't compute or algorithms; it's training data. Good computer-use data is scarce, so Softinn built synthetic data pipelines early. For its first model release in April 2025, virtually the entire dataset was synthetically generated. The longer-term plan is to blend synthetic data with human-labeled examples and data captured from live product usage.
Go-to-market
The initial commercial model is API-first: sell access to a strong computer-use model to enterprises and startups building products across different verticals. Consumer applications are on the roadmap but treated as a later-stage bet. Löfquist notes that no computer-use product, whether OpenAI's Operator or Anthropic's Computer Use, has yet achieved the kind of breakout adoption that basic ChatGPT did, which suggests the market is still early and the category is still being defined.
The forward-looking architecture Löfquist describes has models fluidly switching between computer-use mode and structured API calls depending on what the task requires — not a binary choice between the two paradigms.
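A minimal sketch of what that switching could look like, purely as an illustration: it is not Softinn's implementation, and every name in it (Task, API_HANDLERS, run_via_ui, dispatch) is hypothetical. The assumption is that a dispatcher prefers a structured API call when the target application exposes one and otherwise falls back to computer-use mode.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    name: str        # hypothetical task label, e.g. "schedule_meeting"
    target_app: str  # application the task runs against


# Hypothetical registry: apps assumed to expose a reliable structured API.
API_HANDLERS: Dict[str, Callable[[Task], str]] = {
    "calendar": lambda task: f"api:{task.target_app}:{task.name}",
}


def run_via_ui(task: Task) -> List[str]:
    """Stand-in for computer-use mode: lists the UI actions a model might emit."""
    return [f"open:{task.target_app}", f"click_and_type:{task.name}"]


def dispatch(task: Task) -> dict:
    """Use the structured API when one is registered, otherwise drive the screen."""
    handler = API_HANDLERS.get(task.target_app)
    if handler is not None:
        return {"mode": "api", "result": handler(task)}
    return {"mode": "computer_use", "result": run_via_ui(task)}
```

In this sketch, a task against "calendar" is routed to the API handler, while a task against an app with no registered handler (say, a legacy ERP system) falls back to UI actions. A real system would presumably make this choice per step rather than per task.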