Interview

Fal raises $140M Series D led by Sequoia as image editing surpasses AI video in revenue

Dec 9, 2025 with Gorkem Yurtseven

Key Points

  • Fal raises $140 million Series D led by Sequoia, its third fundraise in 2025, as demand for AI inference infrastructure accelerates.
  • Image editing unexpectedly surpassed AI video as Fal's largest revenue driver after stronger models arrived in May 2025, serving advertisers, retailers, and studios.
  • AI video remains commercially immature due to missing character and scene consistency; Fal bets the category repeats image editing's trajectory when model quality converges in 2026.

Summary

Fal, the AI infrastructure platform, closed a $140 million Series D led by Sequoia Capital, with meaningful participation from Kleiner Perkins and Nvidia. The round is the company's third fundraise in 2025 alone, signaling sustained investor appetite for inference and model-serving infrastructure.

Image editing has overtaken AI video as Fal's largest revenue driver, a result that surprised the company's own leadership. Fal entered 2025 expecting AI video to carry growth, and while that segment performed strongly, image editing emerged as the dominant category after the first compelling editing models arrived around May 2025. Customers on the platform include major advertisers, retail platforms, design and productivity apps, and movie studios.

The generative media model market is structurally more fragmented than the LLM space. The leading image model shifts frequently, with Google's Nano Banana, an image model built on a smaller Gemini model, currently holding the top position for image editing. Open-source communities and Chinese labs have consistently closed gaps within months of frontier releases, partly because seeing market demand clarifies what to build, and partly because research techniques propagate quickly.

AI video remains pre-mainstream on the commercial side. Producing a high-quality AI video ad currently requires more effort than a conventional shoot, limiting adoption to technically proficient creators. Kling, developed by a Chinese lab, is identified as one of the stronger current video models, with recently announced editing capabilities drawing attention. The critical missing feature across video models is character and scene-to-scene consistency, viewed as the unlock for broad commercial adoption in 2026.

Fal's go-to-market thesis is that the image editing trajectory will repeat in video as model quality and usability converge, pulling in studios and retailers at scale. The company is positioning its platform to capture that volume regardless of which underlying model leads at any given moment.