What Reuters Ruling Means For AI Fair Use And Copyright
By David Ben-Meir (February 28, 2025)
In Thomson Reuters Enterprise Centre GMBH v. ROSS Intelligence Inc., the U.S. District Court for the District of Delaware was the first to consider a defense of fair use in the context of artificial intelligence systems’ use of copyrighted content for AI training purposes.
Thomson Reuters sued ROSS Intelligence for copyright infringement based on ROSS’ use of Thomson Reuters’ copyright-protected headnotes, which formed part of the input used to train ROSS’ system to produce a competing legal research product. In deciding the parties’ cross-motions for summary judgment, the Delaware district court held that ROSS’ fair use defense fails.
The Feb. 11 decision has drawn considerable attention in part because it is the first such opinion concerning AI systems, but it is not likely to have lasting effect in view of the avalanche of AI decisions to come — and soon. Furthermore, the opinion acknowledges that the uses at issue did not involve generative AI, which distinguishes it from the majority of cases on the horizon.
That said, Thomson Reuters makes two points that will resonate with copyright owners who are disputing tech companies’ unlicensed use of copyright-protected materials to train generative AI models, and their claims that such use is fair.
The fair use defense to copyright infringement requires consideration of at least the following four factors: (1) the use’s purpose and character, including whether it is commercial or nonprofit; (2) the copyrighted work’s nature; (3) how much of the work was used and how substantial a part it was relative to the copyrighted work’s whole; and (4) how the use affected the copyrighted work’s value or potential market.[1]
The Thomson Reuters court’s first point concerns the fourth factor, which it said “is ‘undoubtedly the single most important element of fair use,’” and its view that the use’s effect on the potential market may be based on the mere potential for the copyright holder to license its materials for AI model training purposes.[2] That is, if a rights holder potentially could license its materials for the purpose of training AI models, then that alone may be sufficient for finding that the fourth factor weighs against fair use.
A copyright holder need not have actually licensed its materials — the potential to license is sufficient.
What’s more, one can reasonably infer from the court’s finding that a copyright holder should prevail on the fourth factor in any case where copyrighted materials are used to train a generative AI model. The reason is that the market impact analysis does not rely on how the AI platform is ultimately used, even if its use is distinct from the copyrighted works’ purpose.
That generative AI infringement cases necessarily involve using material to train AI models would arguably make market impact self-evident, because the copyright owner potentially could have licensed its copyright-protected data to that same or another AI platform. That is, because its data was used for AI training without a license, the negative market impact on the copyright owner is necessarily established.
The second point relates to the first factor — the use’s purpose and character, including whether it is commercial or nonprofit, and whether the use of copyrighted material for AI training is transformative as compared with the original work’s purpose, which is a matter of degree.[3]
The Delaware district court concluded that ROSS’ use of Thomson Reuters’ materials for the purpose of training the AI model was not transformative.[4] In reaching this finding, the district court rejected one argument AI developers are making: that the copying happens only during an intermediate process in the AI model training and never appears at the model’s output.
The court rejected this argument largely by distinguishing a line of computer code cases that ROSS relied on but that the court found inapposite. These were primarily cases like the U.S. Supreme Court’s 2021 decision in Google LLC v. Oracle America Inc., in which copying computer code was deemed necessary to achieve the desired transformative use.[5]
Because most AI model training does not concern using copyright-protected computer code to transform the code to a different use, the district court’s opinion may cause AI systems’ developers to temper their reliance on the computer code cases to support their transformative use claims. By the same token, copyright holders may cite this finding from Thomson Reuters to rebut claims of fair use based on the first factor.
Nevertheless, there are at least two reasons why the district court’s finding on this point does not end the issue regarding the fair use argument of intermediate copying for purposes of AI training. First, to the district court, it was clear that ROSS did not need to copy Thomson Reuters’ headnotes to train its AI model.[6] ROSS could have gone to the source materials — the court opinions themselves — and generated its own headnotes for training purposes.
But that analysis is not the same as one that considers what is necessary to train a generative AI model. One can envision reasonable arguments that such training requires providing the model as much data as possible to enhance its ability to generate new and useful content.
The second reason concerns an issue that the district court never directly addresses. Generally, the very purpose of AI model training is to fragment training inputs into elements that can be organized into mathematical relationships that describe ideas and concepts. It is a mathematical process of extracting and abstracting from an expression the idea the expression conveys.
There would appear to be very reasonable arguments that an intermediate process of idea extraction using copyrighted material is transformative under the first factor of fair use, and arguably does not constitute actual copying at all, because, as in Thomson Reuters, the only thing the AI model is practically obtaining from the headnotes is the ideas behind them.
It is a distinct argument we should expect to see forcefully advanced in cases to come. Accordingly, while copyright holders may be keen to rely on Thomson Reuters to argue that intermediate copying to train an AI model is not a transformative use under the first factor, the question of whether generative AI is transformative in purpose or character remains — for now — an open one.
[1] 17 U.S.C. § 107.
[2] Thomson Reuters at *9, citing Harper & Row, 471 U.S. at 566, 105 S.Ct. 2218 (1985).
[3] Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith, 598 U.S. 508, 143 S.Ct. 1258 (2023).
[4] Thomson Reuters at *8.
[5] Google v. Oracle, 593 U.S. 1, 141 S.Ct. 1183 (2021).
[6] Thomson Reuters at *10.
