AI Technology and Legal Effects Continue Rapid Evolution

by Craig Gipson

The whiplash-inducing 2025 news cycle can relegate even the trendy topic of AI to the back pages. But the rapid pace of technological development and the law’s effort to keep up show no signs of slowing this year. Since January, the U.S. Copyright Office has issued the second part of its report on AI and copyright, a federal court has issued the first U.S. legal decision on whether AI training infringes copyright-protected content, and book publishers have joined journalistic organizations in licensing content to AI providers.

Can I Copyright That? Copyright Office Report on AI and Copyrightability

In January, the U.S. Copyright Office issued Part II of its report on artificial intelligence. [1] Many ECPA members are aware of the Office’s initial guidance from 2023 on AI-generated materials and the best practice of treating content originating from an AI model as ineligible for copyright protection. The Office’s most recent report—part of a years-long effort to more broadly investigate and evaluate the implications of generative AI technology for copyright law—does not alter the state of copyright law in the U.S. but allows for more nuance in its application.

Ultimately, the Office holds to its position that copyright protects only human authorship and that purely AI-generated material remains ineligible for protection. However, the Office softens its stance by acknowledging that what constitutes “human authorship” is less a binary determination than a point on a spectrum. And despite its efforts, the report does not always provide clear guidance as to where on that spectrum rights holders must land to preserve copyrightability.

While the report delves into the gray areas of AI and copyright, it acknowledges the parts that remain black-and-white, such as: (1) merely prompting an AI model to produce output does not constitute sufficient human control for copyrightability (“[t]he fact that identical prompts can generate multiple different outputs further indicates a lack of human control”); and (2) human input into elements other than the expressive elements of a work (those which would be subject to copyright protection if human-created) is also inadequate.

But the Office focuses much of its attention on the pressing question of what to do with works that contain both human and AI-generated material: when does the human author’s contribution satisfy the minimum control necessary to meet the copyrightability threshold? In trying to provide guidance on this point, the Office takes a more expansive view of what constitutes “assistive AI” and is particularly sympathetic toward what it terms “expressive inputs.”

While many “assistive AI” functions are well-known (spell check, autocorrect, etc.), the Office contemplates a broader range of functions being “assistive.” Assistive AI tools are those which “enhance human expression” but do not “stand in for human creativity,” and therefore do not limit copyright protection. This may be true even when the AI serves in an editor-type role. For example, the Office notes that an artist may feed “expressive inputs”—human-created works such as illustrations—into an AI model and ask that the model modify the color or layer the image. While we may think of choices such as color selection as expressive, the Office cites an argument that AI output generated from such an expressive input “may have a greater claim to authorship” because “there is a limited range of specific expressive output that is objectively foreseeable as a result” of the human artist’s contribution. In other words, because the AI’s starting point is a human-created work, the unpredictability of the AI’s output is reduced, increasing human control and the likelihood of copyrightability. The Office views this use of expressive inputs as the user “prompting a generative AI system and referencing, but not incorporating, the output in the development of the user’s work.” The report, however, leaves room for interpretation as to the distinction between “referencing” and “incorporating.”

If the standard for copyrightability of works created through both human and AI contributions still sounds unclear, the Office acknowledges as much. The report closes with a note that courts will continue to apply copyright law to AI uses on a case-by-case basis.

For publishers, the key question on copyrightability and AI remains the degree of human control and the way in which publishers utilize AI tools. For more risk-tolerant publishers, the report opens the door to greater AI use in aiding content creation, provided the ultimate expression of the content is still of human origin.

A note on ECPA advocacy: ECPA submitted a public comment to the Copyright Office in response to the Office’s request for input on AI policy. Of more than 10,000 public comments submitted, the Copyright Office made 207 citations, four of which referenced ECPA’s comment. The advocacy efforts of ECPA publishers on policy matters such as copyright and AI can impact not only Christian publishing but all creative industries.

A Fair Use Victory In Court For Rights Holders? Not So Fast…

AI and copyright intersect not only on the issue of copyrightability but also in the fair use question surrounding the training of AI models. It is well established that frontier AI models were trained on an almost incalculable number of copyrighted works without authorization. AI providers claim their use of copyrighted works to create large language models is “transformative,” and therefore protected as fair use. Multiple copyright infringement lawsuits remain pending, but the first federal court decision on the issue found in favor of the rights holder, deeming the training of the AI model infringing. [2] But before the publishing and creative industries begin celebrating, copyright commentators warn that the case’s unique set of facts leaves its precedential value unknown.

Thomson Reuters owns the widely used legal research database Westlaw. Part of Westlaw’s value derives from its proprietary “headnotes,” which are summaries of key legal points from court opinions. AI provider ROSS Intelligence sought to license Westlaw’s headnotes, but Thomson Reuters declined. As an alternative, Ross hired another company, LegalEase, to create case summaries closely resembling Westlaw headnotes and input those summaries into Ross’s AI tool.

The court found that more than 2,200 of Ross’s summaries were copied from, and substantially similar to, Westlaw’s headnotes. In addition to finding that the purpose of Ross’s use was commercial—weighing against a finding of fair use—the court addressed one of the pressing questions of how AI tools use copyrighted material. A common element of the fair use defense in these cases is that AI tools use copyrighted material only as an “intermediate step” in the creation of the model. The AI models do not, according to Ross’s argument, retain or reproduce the copyrighted material but instead turn the material “into numerical data about relationships among legal words to feed into its AI.” As discussed further below, the court rejected this position but noted the potentially limited applicability of its finding.

Finally, the court found that Ross’s AI tool was a market substitute for Westlaw, which was the coup de grâce to any fair use defense. Allowing Ross’s tool to operate under a fair use framework would undermine current and potential derivative markets for Westlaw, including licensing for AI training datasets.

Some of the questions surrounding the precedential value of this decision arise from the nature of Ross’s AI tool: it is not a generative AI tool but rather a legal search engine. Ross’s model does not produce content that mimics Westlaw headnotes. This weighed in favor of Ross’s fair use argument, but the court found the commercial purpose and the damage to Westlaw’s market determinative. The court acknowledged that its reasoning may or may not apply in a generative AI context. Generative AI providers may assert an argument similar to Ross’s—that their models “learn” the patterns of language from the material they train on without using the material in any other way that would violate the rights of a copyright owner. But these providers are aided by the fact that their models produce original content, which may be found to be an additional step toward a “transformative” use.

There is one other layer to the generative AI training-and-infringement series of cases not found in Thomson Reuters v. ROSS Intelligence: the legality of the training dataset. Many large language models were trained on vast quantities of copyrighted material that were infringing copies in their own right. A well-known training dataset called Books3 contained at least 183,000 unauthorized copyrighted works. This month, The Atlantic reported that Meta trained its AI models on pirated works from the infamous LibGen library. [3] Even if AI providers can show that training generative AI models is fair use, was the pre-training gathering of these materials infringing? According to The Atlantic, Meta downloaded a massive number of illegal files containing pirated works. That action harks back to the facts supporting the music industry’s lawsuits against Napster users 25 years ago: each download is a reproduction of a pirated file and a clear copyright violation under statute. Can AI providers successfully engage in “copyright laundering,” turning pirated training datasets into fair use-protected AI models? Only time and the courts will tell.

Better to Ask Forgiveness Than Permission? AI Licensing Content from Rights Holders

With the copyright implications of AI training in flux, AI providers began seeking licenses from rights holders in the past year. First, a series of journalism organizations reached arrangements with some of the major AI players: the Financial Times, News Corp., The Atlantic, Time, and Vox Media granted licenses to OpenAI, while Reuters, Hearst, and USA Today reached deals with Microsoft. Next, reports emerged that some book publishers were exploring similar opportunities, the best-known to date being HarperCollins’ license with Microsoft.

As licensing opportunities arise, there are some basic principles publishers should keep in mind.

1. Costs and Benefits. Publishers need to weigh the costs and benefits of AI licensing. While much has been written about the risks of licensing to AI, there are risks to sitting out as well. Concerns about piracy, content replication, and customer intermediation are valid, but so are concerns about visibility and discoverability. As search technology becomes more reliant on AI, will potential readers be able to find your content when they submit a query?

2. Rights. Publishers need to know what rights they have acquired from authors and whether those rights include AI licensing. If not, what kind of contractual amendments are needed, and how should they be presented to authors and agents? Publishers should also review or implement AI policies that allow for desired AI uses.

3. Communication. Communicating with authors and agents about AI opportunities is key. Publishers will play an important educational role in helping authors understand the value and risks of potential AI uses.

4. Opportunities. Publishers that are not approached by AI providers may need to research options on their own. Independent publishers could consider content aggregators that can offer larger quantities of content to technology companies for licensing. In an interview, Jason Malec told us that Gloo is seeking to build a robust library of legally sourced Christian content for Gloo’s own offerings and, if publishers choose, for licensing to other AI providers. While Gloo is pursuing Christian content publishers, other AI providers beyond the frontier models, such as ProRata AI, may also offer opportunities.

5. Types of Rights. AI rights are not uniform. There are differences between allowing AI providers to train models on content and allowing them to use content for inference purposes. Understanding the rights a publisher has acquired from its authors, and the rights it wants to grant to AI providers, is key. As noted in the Gloo example above, Mr. Malec explained that Gloo’s model offers à la carte options for publishers to select the rights being granted.

6. Timing. Given the uncertainty of AI’s copyright status, there is corresponding uncertainty in the licensing market. If courts find training AI models to be fair use, will the market for licensing dry up overnight? The EU’s AI Act requires disclosure of training data, but once enough AI-generated data exists, will AI providers even need future licenses from rights holders when they can train on uncopyrightable AI-generated data?

That AI is the next disruptive technology in publishing is unavoidable. But just as earlier technologies presented certain risks, AI also brings new ways of communicating the Christian message. ECPA will continue monitoring AI developments to provide members with relevant information for evaluating licensing opportunities, AI policies, and contractual provisions.


Endnotes

[1] https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-2-Copyrightability-Report.pdf

[2] Thomson Reuters v. ROSS Intelligence, No. 1:20-cv-613-SB (D. Del. Feb. 11, 2025)

[3] Reisner, Alex. “The Unbelievable Scale of AI’s Pirated-Books Problem.” The Atlantic, March 20, 2025. https://www.theatlantic.com/technology/archive/2025/03/libgen-meta-openai/682093/


This article is provided for informational purposes and is not intended as legal advice. This article was first published as an ECPA Legal Update.