🟡 Active Litigation — No Settlement Yet

Authors Guild v. OpenAI
What Authors Need to Know

OpenAI trained its GPT models on millions of copyrighted books, without permission and without payment. Multiple lawsuits are active. A settlement could be worth as much as, or more than, Anthropic's $1.5B.

Status: Discovery Phase
Courts: SDNY + N.D. Cal.
Settlement: Not yet announced

Case Overview

OpenAI trained its GPT family of models — including GPT-3, GPT-4, and subsequent versions — on massive datasets that included millions of copyrighted books. Among the most well-documented sources is Books3, a dataset curated by researchers that contained approximately 196,000 full-text books scraped from shadow libraries and other sources without licensing. Books3 was also included in a larger dataset called The Pile, an 800GB collection assembled by EleutherAI.

In 2023, a wave of copyright lawsuits began. The Authors Guild, one of the most prominent author organizations in the United States, filed suit along with bestselling authors including John Grisham, George R.R. Martin, Jodi Picoult, David Baldacci, and Jonathan Franzen. Penguin Random House and other major publishers have also filed related actions.

The cases allege that OpenAI reproduced, distributed, and used copyrighted books as training data without licenses or compensation — a direct violation of the Copyright Act. Unlike the Anthropic case, which focused on a specific download event (August 10, 2022), the OpenAI cases involve a more complex picture of ongoing data collection and use.

Current Status

⚖️
Discovery Phase
As of April 2026, the cases are in active discovery. Plaintiffs are gathering evidence about which datasets OpenAI used and how they were compiled. No settlement has been announced.
🏛️
Multiple Venues
Cases are active in the Southern District of New York and the Northern District of California. The cases may be consolidated or proceed separately.
🚫
No Settlement Announced
There is no settlement to claim right now. Join our waitlist and we will notify you immediately when material developments occur.

What We Know So Far

📖
Datasets Used
Books3, The Pile, Common Crawl, and internal proprietary datasets. Books3 contains ~196,000 full-text books from shadow libraries. The Pile includes Books3 plus other copyrighted material.
👥
Authors Who Sued
John Grisham, George R.R. Martin, Jodi Picoult, Jonathan Franzen, David Baldacci, Scott Turow, and many others via the Authors Guild. Penguin Random House also filed related actions.
📏
Estimated Scope
Potentially hundreds of thousands of qualifying works, similar in scale to the Anthropic settlement. The Books3 dataset alone covers ~196,000 books; broader OpenAI training data likely includes far more.
💰
Settlement Potential
If the Anthropic case ($1.5B for ~400K books) sets a benchmark, an OpenAI settlement could be comparable or larger, given OpenAI's higher valuation and broader training data use.
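To make that benchmark concrete, here is a rough back-of-the-envelope sketch of the average gross amount per book implied by the Anthropic figures quoted above. It assumes the approximate numbers on this page ($1.5B and ~400K books) and ignores attorney fees, administrative costs, and any split between authors and publishers, so it is an illustration rather than a payout estimate. The function name `implied_gross_per_work` is hypothetical.

```python
def implied_gross_per_work(total_usd: float, works: int) -> float:
    """Average gross amount per work: total fund divided by number of works."""
    if works <= 0:
        raise ValueError("works must be positive")
    return total_usd / works


# Approximate figures quoted on this page (assumptions, not court-verified inputs).
ANTHROPIC_TOTAL_USD = 1_500_000_000
ANTHROPIC_WORKS = 400_000

per_work = implied_gross_per_work(ANTHROPIC_TOTAL_USD, ANTHROPIC_WORKS)
print(f"Implied gross average per work: ${per_work:,.0f}")
```

Any OpenAI figure would depend on the final number of qualifying works and the terms of a settlement that does not yet exist, so the same arithmetic cannot be applied to OpenAI today.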

Why This Matters

The Anthropic settlement proved that AI companies can be held financially accountable for using copyrighted books without permission. OpenAI is a larger company with broader usage of training data — and faces an even larger coalition of plaintiffs.

If OpenAI reaches a settlement, it could match or exceed the Anthropic settlement. Authors with books in Books3 or other datasets would be potential claimants. As with the Anthropic settlement, the earlier you prepare, the better positioned you'll be when claims open.

Being on our waitlist costs nothing. We'll alert you the moment a settlement is announced, give you guidance on filing claims, and offer to purchase your claim for immediate cash if you prefer not to wait.

Be First to Know

Join our waitlist and we'll notify you the moment an OpenAI settlement is announced, claims open, or material case developments occur. No spam — just updates that matter.

FAQ

Is there an OpenAI settlement I can claim?

Not yet. As of April 2026, OpenAI has not reached a settlement in any of the major copyright cases filed by authors or publishers. The cases are in the discovery phase. If and when a settlement is reached, we will notify everyone on our waitlist.

Which books did OpenAI use for training?

OpenAI's GPT models were trained on large text datasets including Books3 (a dataset of ~196,000 copyrighted books), The Pile (an 800GB text dataset that includes Books3 and other copyrighted material), and Common Crawl. The Books3 dataset in particular has been central to multiple copyright lawsuits.

Who has sued OpenAI over copyright?

Major plaintiffs include The Authors Guild, Penguin Random House, and individual authors including John Grisham, George R.R. Martin, Jodi Picoult, David Baldacci, and Jonathan Franzen. Multiple cases have been filed in the Southern District of New York and the Northern District of California.

How would an OpenAI settlement compare to Anthropic's?

It's impossible to predict exactly, but the scale could be similar or larger. OpenAI has a higher valuation than Anthropic, and the number of books allegedly used in training is comparable. The Anthropic settlement, valued at $1.5B, may serve as a reference point for what authors in the OpenAI cases could expect.

What should I do right now?

Join our waitlist. We'll notify you the moment a settlement is announced, claims open, or material developments occur. In the meantime, you can also check if your books qualify for the Anthropic settlement, which is already approved and moving toward distribution.

Have books that may qualify for the Anthropic settlement?

That settlement is already approved and moving toward distribution. Check if your books are on the Anthropic Works List.

Explore Bartz v. Anthropic →