In 2025, three federal courts finally confronted a question that had hovered over artificial intelligence for years: can machines legally learn from copyrighted works? Each opinion—Thomson Reuters v. Ross Intelligence, Bartz v. Anthropic, and Kadrey v. Meta Platforms—applied the four-factor fair-use test under 17 U.S.C. §107 to large-scale model training. Together, they form the first real framework for evaluating how copyright interacts with machine learning.
The results point toward a single principle: when AI training reproduces a copyrighted work’s market function, it fails fair use. When the training is analytical—using works as data rather than expression—it passes. The following cases mark where courts are now drawing that line.
1. Ross Intelligence: Copying to Compete
In Thomson Reuters v. Ross Intelligence (D. Del. 2025), Thomson Reuters accused Ross of using its Westlaw headnotes and key-number system to train a competing legal-research AI. Ross’s contractor, LegalEase, created more than 25,000 “Bulk Memos” built from Westlaw content. Those memos became Ross’s training corpus, allowing its AI to return results that mimicked Westlaw’s summaries and topic hierarchy.
Judge Stephanos Bibas found that Ross’s conduct went beyond learning. The company “built its competing product using Bulk Memos, which in turn were built from Westlaw headnotes.” That functional overlap was decisive.
Bibas’s Four-Factor Fair-Use Analysis
- Purpose and Character. The court held that Ross’s copying was commercial and not transformative. Its system served the same purpose as Westlaw—legal research—and targeted the same customers. Intermediate copying cases like Google v. Oracle or Sony v. Connectix did not apply because Ross’s use was competitive, not compatibility-driven.
- Nature of the Work. The Westlaw headnotes were creative editorial summaries, not raw facts. This factor weighed against fair use.
- Amount and Substantiality. Although Ross copied thousands of headnotes, the court focused on what the public ultimately received. Because the AI did not display the headnotes directly, this factor slightly favored Ross.
- Market Effect. Judge Bibas called this “the single most important” factor. Ross’s model competed directly with Westlaw’s subscription market, defeating any fair-use claim.
Outcome
Three of four factors—purpose, nature, and market effect—went against Ross. Only the third factor offered partial support. The result was straightforward: using a rival’s proprietary database to build a substitute product is infringement, not innovation. Ross now anchors the limit of permissible AI training.
2. Bartz v. Anthropic: When Learning Is Transformative
Only months later, Judge William Alsup reached the opposite conclusion in Bartz v. Anthropic PBC (N.D. Cal. 2025). Authors alleged that Anthropic’s Claude models copied their books from “shadow libraries” during training. Anthropic responded that the training process extracted statistical patterns about language and style; it did not store or output the books’ expressive content.
Alsup’s Four-Factor Fair-Use Analysis
- Purpose and Character. Alsup called the first factor “the center of gravity” in technological cases. Claude’s use was “quintessentially transformative” because the system converted expressive text into numerical weights. Like a writer who studies many books before creating new ones, the model learned linguistic structure rather than reproducing expression. Although Anthropic’s business was commercial, the transformative purpose outweighed that concern.
- Nature of the Work. The plaintiffs’ novels were creative, which ordinarily weighs against fair use, but the court gave this factor little weight because the use was analytical, not expressive.
- Amount and Substantiality. Anthropic copied the works in full, but the court found this necessary to achieve the transformative goal. Partial copying would have crippled the model’s ability to learn syntax and semantics. Because the system did not expose the works to users, factor three favored Anthropic.
- Market Effect. The plaintiffs offered no empirical proof that Claude displaced sales of their books or harmed any existing licensing market. Alsup rejected hypothetical “training-data licensing” markets as too speculative. With no evidence of substitution, this factor “weighs heavily in favor of Anthropic.”
Outcome
Three of four factors—purpose, amount, and market effect—favored Anthropic. The second factor mattered little. Judge Alsup granted summary judgment for Anthropic on the training question, holding that using copyrighted works as input for analytical learning qualified as fair use. Bartz thus established the first federal recognition of large-scale model training as transformative use.
3. Kadrey v. Meta: Transformative Function Confirmed
Two days after Bartz, Judge Vince Chhabria issued a companion opinion in Kadrey v. Meta Platforms Inc. (N.D. Cal. 2025). Authors including Richard Kadrey and Sarah Silverman alleged that Meta trained its LLaMA 2 and 3 models on pirated copies of their novels. Meta conceded that full works were ingested but argued that training was purely analytical: the models extracted linguistic patterns and generated new text, not copies.
Chhabria’s Four-Factor Fair-Use Analysis
- Purpose and Character. Judge Chhabria found Meta’s purpose “entirely new and different.” Training LLaMA on books taught it relationships among words and syntax—the same reasoning adopted in Bartz. He compared the process to Google Books, where scanning entire works to enable search was transformative. Although Meta profited from the resulting models, the training itself served an educational, non-expressive purpose. Factor one favored Meta.
- Nature of the Work. The plaintiffs’ novels were creative, but the court found that creativity had “diminished significance” because the copying targeted language structures, not expression. This factor slightly weighed against Meta but carried little weight.
- Amount and Substantiality. Complete copying was “technologically necessary.” The models needed full context to learn patterns. Because users never received expressive text, factor three favored Meta.
- Market Effect. Plaintiffs failed to show lost sales or measurable substitution. Their theory that AI-generated fiction diluted demand was “unsupported speculation.” The court also declined to recognize a hypothetical training-data market. Factor four favored Meta.
Outcome
Three factors supported fair use; the second carried little weight. Kadrey reinforced Bartz: analytical ingestion of copyrighted works for machine learning is transformative and permissible when there is no market harm. With two consistent rulings from the Northern District of California, the courts signaled growing consensus on the legality of data-driven training.
Doctrinal Patterns Across the Trilogy
Although each court applied the same four statutory factors, the outcomes diverged along a simple axis: Ross involved substitution; Bartz and Kadrey involved learning. From those decisions, several guiding principles now shape the fair-use landscape for AI.
Transformation Depends on Function
Courts now focus on what the copying does, not how it looks. When a system ingests expressive works to compute statistical relationships rather than to reproduce text, the purpose is transformative. Ross failed this test because its AI served the same research function as Westlaw. Bartz and Kadrey passed because their models used books as linguistic data, not as market substitutes.
Intermediate Copying Can Be Lawful
Following Google v. Oracle and Sony v. Connectix, courts accept complete copying when it is technologically necessary and non-expressive, so long as the end user never receives the protected content. Both Bartz v. Anthropic and Kadrey v. Meta treated wholesale ingestion as acceptable intermediate copying.
Market Harm Requires Evidence
Each decision reiterated that the fourth factor dominates. Speculative licensing markets or generalized fears of dilution are insufficient. Without empirical proof of substitution, fair use prevails. Ross offered concrete competition; Bartz and Kadrey did not.
Creativity Still Matters—but Less
The creative nature of novels or editorial summaries remains relevant but no longer decisive. In analytical or functional contexts, courts treat creativity as a weak factor that yields to transformation and market analysis.
A New Boundary for Fair Use
The trilogy collectively draws a pragmatic boundary. AI training that competes with a copyrighted product will fail fair use. Training that learns from works to generate new, non-substitutive outputs will likely succeed.
This distinction aligns with Warhol and Google v. Oracle: transformation turns on purpose, not on medium. A change from text to algorithm matters only if it changes the function. When copying repurposes expression into data for computation, that function diverges enough to justify fair use.
For developers, these cases highlight the need to document how training data is sourced and used. Maintaining clear records that separate analytical use from expressive reproduction will strengthen future defenses. For rights holders, the message is to focus on demonstrable market harm rather than speculative licensing theories.
What Comes Next
The 2025 decisions are the first chapter, not the last word. Three developments will shape what follows.
- Evidence-Based Market Analysis. Plaintiffs will need concrete proof of substitution or measurable loss to challenge AI training. Courts have made clear that conjecture about future harm will not suffice.
- Transparency Requirements. Judges in Bartz and Kadrey both noted the value of technical documentation. Expect future discovery battles over dataset provenance, filtering, and output testing to prove whether copying is truly non-expressive.
- Possible Legislative Codification. The Copyright Office has opened rulemaking dockets on AI and training data. Congress may eventually distinguish “analytical” from “expressive” copying, echoing the courts’ reasoning.
Until then, these cases provide the working rule of thumb for 2026 and beyond.
Final Thoughts
The takeaway here is that transformation protects learning; substitution invites liability. Courts are adapting the flexible fair-use doctrine to modern technology without rewriting the statute. As AI becomes embedded in creative and analytical work, the emerging doctrine rewards transparency and proof over speculation. The law is beginning to balance innovation and authorship—by asking what the machine does with the works it reads.
Editor’s Note: This article was updated on 10-14-25 to correct the name of the judge in Bartz v. Anthropic to Judge William Alsup.
Join the Discussion
Anon
October 13, 2025 11:55 am
Patrick,
Your view sounds more in attempting (impermissibly) to protect “in the style of” rather than any single actual item.
Perhaps the “just seems wrong” stems from your wanting copyright protection to protect more than what it legitimately protects.
Patrick Kilbride
October 13, 2025 08:23 am
Thanks for sharing this insightful analysis. Among other things, this finding just seems plain wrong:
“the court found that creativity had ‘diminished significance’ because the copying targeted language structures, not expression.”
Have these judges read Joyce, Faulkner, Shakespeare – or any other author – whose syntax and language structure are integral to their creative genius?
Anon
October 10, 2025 10:16 am
Prof. Edward Lee (ChatGPT is eating the world) has a similar triangulation approach to the three cases dealing with fair use in the AI training scenario.