Interview

Anthropic agrees to pay $1.5B in largest-ever copyright settlement over AI training on pirated books

Sep 5, 2025 with Cecilia Ziniti

Key Points

  • Anthropic settles Authors Guild copyright lawsuit for $1.5 billion, the largest copyright recovery in history, covering use of pirated books from LibGen to train Claude through August 2025.
  • Court ruling establishes clear line: acquiring books through legitimate channels is permissible; sourcing from pirated libraries is not, setting precedent for AI training data practices.
  • Settlement positions licensing infrastructure as the industry's path forward as LLMs compete for high-quality content, with the ASCAP music model as potential template for resolving data rights disputes.
Summary

Anthropic has agreed to pay $1.5 billion to settle a copyright lawsuit brought by the Authors Guild, making it the largest publicly reported copyright recovery in history. The settlement, filed as an unopposed motion, covers the non-fair-use acquisition of copyrighted books used to train Claude, and is limited to past infringement through August 2025.

Cecilia Ziniti, a technology attorney, walked through the structure. The case centered on Anthropic's use of LibGen, a pirated book library she describes as a Napster equivalent. Authors whose works appear on a named works list will receive at least $3,000 per work, with payments distributed by a court-appointed administrator. Ziniti notes the settlement draws a clear legal line: buying books from a used bookstore and scanning them was ruled permissible; sourcing from a pirated library was not.

Critically, the settlement does not touch AI outputs at all. There was no allegation that Claude reproduced or substituted for the original books; the case was strictly about how the training data was acquired.

One detail from the court's findings: Anthropic had hired someone from Google's book-scanning project who was reportedly tasked with obtaining "all the books in the world" while avoiding as much "legal/practice/business slog" as possible. That framing, Ziniti notes, speaks to just how deliberately and operationally LLM companies have approached large-scale data acquisition.

The NYT case and what comes next

The plaintiffs' firm, Susman Godfrey, also represents The New York Times in its ongoing case against OpenAI. Ziniti thinks that case could go all the way to the Supreme Court: The New York Times is one of the few plaintiffs with a strong enough content library, deep enough pockets, and direct copyright ownership to pursue that path.

Ziniti frames the broader trajectory as a Napster-to-iTunes moment. Whether the resolution comes through litigation, legislation, or commercial deals, she expects licensing infrastructure to emerge, potentially resembling the ASCAP model for music. OpenAI and Anthropic have already begun cutting data deals, and Anthropic's settlement arguably sets a useful precedent: training on legitimately acquired books is fine; pirated sources are not.

For Anthropic, the settlement lands as a knowable cost rather than an existential one. The company carries a $183 billion valuation and recently raised another billion. The $1.5 billion covers a bounded period and a specific infringement category, letting the company move forward with cleaner legal footing on training data practices.

As LLMs exhaust the open web and compete for high-quality new content, how the industry resolves data rights will determine what gets built next. The Anthropic settlement sets one data point on what piracy-adjacent training is worth, at least in the Ninth Circuit.