Judge rules Anthropic training on books it purchased was “fair use,” but not for the ones it stole
Anthropic still faces litigation for training its models on millions of pirated texts.
When AI companies like OpenAI, Anthropic, and Meta were racing to build and train new large language models, they scrambled to find enough text to train their systems on. Countless web pages, photos, YouTube videos, Disney movies, Reddit threads, and book texts were slurped up to feed the models, adding billions upon billions of tokens.
Litigation initiated by copyright holders has since shown that the legality of the process was on the minds of some AI company employees. Researchers at Meta, for example, raised concerns while training its Llama model, only to be told that the use of LibGen, a corpus of pirated texts, was approved by “MZ.”
But yesterday, a court decided a case partially in favor of AI companies, with far-reaching consequences for all the companies that were sucking copyrighted material into their models.
A federal judge in the Northern District of California has ruled that Anthropic did not violate the copyrights of authors whose books it purchased and scanned for training.
A group of authors filed the suit against Anthropic last August, alleging that Anthropic had acknowledged training its Claude AI model using “The Pile,” a mass of text shared online that contained millions of copyrighted works, including some written by the plaintiffs.
Judge William Alsup determined that the process of buying, scanning, and ingesting the text for use in training the Claude model was “exceedingly transformative and was a fair use under Section 107 of the Copyright Act,” a key test of the fair use doctrine in intellectual property law.
But what about the “over seven million copies of books” that Anthropic admitted it pirated and did not pay for? The judge ruled that was not fair use and warrants its own trial.
Judge Alsup wrote:
“The downloaded pirated copies used to build a central library were not justified by a fair use. Every factor points against fair use. Anthropic employees said copies of works (pirated ones, too) would be retained ‘forever’ for ‘general purpose’ even after Anthropic determined they would never be used for training LLMs. A separate justification was required for each use. None is even offered here except for Anthropic’s pocketbook and convenience.”
The case is the first of its kind to be decided in the US, and it lays out a potential legal path for AI companies to safely train their models on copyrighted works: purchase them. That said, many other cases are still pending, and many factors remain at play before the industry has clear rules.
But companies that are caught knowingly using pirated, copyrighted works to train AI models may face new legal exposure.
An Anthropic spokesperson told Sherwood News:
“We are pleased that the Court recognized that using ‘works to train LLMs was transformative — spectacularly so.’ Consistent with copyright’s purpose in enabling creativity and fostering scientific progress, ‘Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different.’”