Interview

Guy Gur-Ari of Augment Code on GPT-5's careful tool use and automating the software development lifecycle

Aug 7, 2025 with Harish Abbott

Key Points

  • GPT-5 excels at complex, large-scope development tasks through deliberate tool use and clarifying questions rather than immediate code execution, making it suited for background processing on enterprise codebases.
  • Augment Code is expanding beyond IDE-focused coding tools into full software development lifecycle automation, automating code reviews, incident response, and ticket assignment from production logs.
  • Current AI agents degrade significantly on projects exceeding tens of thousands of lines due to poor architectural decisions that compound over time, requiring close human supervision on design choices.
Summary

Guy Gur-Ari, co-founder and chief scientist of Augment Code, offers a measured but optimistic read on GPT-5's practical utility for enterprise software teams. After several weeks of internal trials, his assessment is that GPT-5 distinguishes itself through deliberate, high-volume tool use — making extensive calls before touching code and proactively asking clarifying questions. That behavior makes it best suited for complex, large-scope tasks rather than quick iterations, with users expected to run it in the background and return for results.

Augment Code's core product targets large engineering teams working on large codebases, handling question answering, refactoring, migrations, and agentic development with deep codebase context built in. Gur-Ari says the company develops its own tool integrations rather than relying on model vendors, and has worked directly with OpenAI to optimize prompting around those tools. His one near-term capability request is better native screenshot support, which he frames as the front-end equivalent of automated test execution — a feedback loop that allows agents to iterate to working UI code the same way back-end agents iterate to passing tests.

Automating the Software Development Lifecycle

The more consequential signal from Augment Code is what Gur-Ari describes as the early stages of software development lifecycle automation. The company's CLI tool exposes its context engine and agent outside the IDE, and developers are beginning to use it to automate code reviews, incident response, production log analysis, and automatic ticket assignment from error logs. That represents a meaningful expansion beyond the IDE-centric "inner loop" that has defined most AI coding tool development to date.

On productivity, Gur-Ari cites senior developers running multiple agents in parallel as capable of achieving 10x or greater output gains. That figure aligns with claims circulating broadly across the AI coding sector, though Augment Code attributes its ability to sustain those gains at scale to its differentiated context engine for large teams working on large codebases.

Where Agents Still Break Down

Gur-Ari's most concrete cautionary note concerns architecture and design decisions. Fully autonomous "vibe coding" produces working code in early stages but degrades significantly once projects reach the low tens of thousands of lines, as poor structural decisions compound and slow development. He argues that all current agents require close human supervision over architectural choices, and frames this as the highest-stakes remaining gap — one that could take at least a year to close. Cybersecurity applications, specifically creative offensive security work like white-hat penetration testing, are flagged separately as another area where current model capabilities fall short.