Intersection of Copyright and A.I.
As Artificial Intelligence (AI) continues to evolve, copyright law becomes increasingly complex for artists, creators, developers, and policymakers. That is why understanding copyright in the context of AI is important.
Section 1: Understanding Copyright Law
Copyright law ensures protection for creative works and creators' rights. It protects human creativity and gives authors, artists, and creators exclusive rights to their original works, allowing them to control how their work is used, distributed, and monetized.
In addition to encouraging creativity, this legal framework promotes the distribution of knowledge and culture.
Key concepts in copyright law include:
Originality
For a work to be eligible for copyright protection, it must be original. The work must be independently created by the author and show a minimum degree of creativity, which requires human authorship (See "The Machine as Author").
The U.S. Copyright Office asserts that works must be created by human authors. Consequently, machines, computers, animals, or any other nonhuman creators cannot benefit from U.S. copyright laws.
Copyright law protects only “the fruits of intellectual labor” that “are founded in the creative powers of the mind.” Trade-Mark Cases, 100 U.S. 82, 94 (1879).
This brings up a complex question: can the output of AI be protected by copyright under current law?
Several scholars, courts, and the U.S. Copyright Office have concluded that AI system outputs can't be protected because protected works must be created by humans. However, what about a combined work of humans and machines?
In some instances, copyright law can protect works made with machines. For example, pictures taken with digital cameras are protected, and so is music edited with software. So, could AI simply be considered another tool that humans use to create copyright-protected works?
It depends on how much human involvement there is in the process. For example, if you instruct AI to write a book for you, the resulting text is unlikely to be protected by copyright. But what if you shape the story by continuously modifying and instructing the AI, such as by adding characters, changing plots, altering settings, or generating alternative endings? Would the result be eligible for copyright protection? Maybe. It remains unclear how much human involvement is needed before a work becomes protected.
Derivative Works
Under US copyright law, the copyright owner holds the exclusive right "to prepare derivative works based upon the copyrighted work." Now, what exactly is a "derivative work"? It's essentially a creation that builds upon one or more preexisting works. Enter ChatGPT, which is trained on a treasure trove of preexisting works and churns out content based on that training. It's a fascinating, albeit legally murky, territory we're navigating here. According to Daniel J. Gervais of Vanderbilt University Law School, "this definition of the right could loosely be used as a definition of machine learning when applied to the creation of literary and artistic productions because AI machines can produce literary and artistic content (output) that is almost necessarily “based upon” a dataset consisting of preexisting works." The difficulty is that human creation, too, is almost always influenced by something the author has read, seen, consulted, experienced, or been inspired by. In essence, "all cultural production is derivative."
Fair Use
The fair use doctrine allows you to use a copyrighted work under certain conditions without the author's permission. This doctrine keeps copyright law flexible in order to prevent a rigid application that stifles the very creativity it is supposed to encourage.
Section 107 of the US Copyright Act provides that fair use of a work "for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research" is not an infringement of copyright.
Under this doctrine, codified in Section 107 of the Copyright Act, people are allowed to make fair use of prior works without taking away the copyright holders' control and profit.
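The fair use doctrine is a helpful tool, but it isn't always easy to apply. To determine whether a use is fair, courts consider the four factors listed in Section 107:
1. The purpose and character of the use, including whether the use is commercial or for nonprofit educational purposes
2. The nature of the copyrighted work
3. The amount and substantiality of the portion used in relation to the copyrighted work as a whole
4. The effect of the use upon the potential market for or value of the copyrighted work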
It is important to note that these four factors are not exclusive, but are rather the primary - and in some cases the only - factors that courts consider. (See Copyright and Fair Use: A Guide for the Harvard Community.)
AI developers have relied on this doctrine to defend themselves against the copyright holders of the data on which they train their models. They claim that training AI models is a transformative use, a key consideration in determining fair use. A use is transformative when it employs the original work in a new and different way, adding new meaning, expression, or message. Developers argue that they are not merely copying the original works: by training AI models, they are creating new, innovative applications that benefit society. AI models can also generate new art, music, and literature, which developers regard as transformative uses of the original copyrighted material. AI developers further argue that using copyrighted materials for training does not harm the market value of the original works. It may even be argued that AI-generated content increases the visibility and appreciation of original works, resulting in greater demand for them. Finally, training typically involves vast amounts of data, making it impractical to seek permission from every copyright holder separately. On this view, AI developers may use copyrighted materials for training purposes under the fair use doctrine.
In the case of Authors Guild, Inc. v. Google, Inc., No. 13-4829 (2d Cir. 2015), which involved the mass digitization of millions of books from libraries, the judge stated that the doctrine applies because "Google Books is also transformative in the sense that it has transformed book text into data for purposes of substantive research, including data mining and text mining in new areas."
On the other hand, copyright holders of the training data argue that the use of their copyrighted materials without permission constitutes copyright infringement, regardless of whether the use is transformative. They contend that AI developers are profiting from their works without providing any compensation, which undermines the value of their intellectual property. The fair use doctrine is intended to balance the interests of copyright holders and the public, but it should not be used as a blanket defense for commercial exploitation of copyrighted materials.
Furthermore, copyright holders argue that AI-generated content can sometimes be very similar to the original works, raising concerns about direct copying and plagiarism. In such cases, the fair use defense becomes weaker, as the use is not sufficiently transformative and directly competes with the original works. They also emphasize the importance of protecting their rights and ensuring that they receive fair compensation for the use of their works. (See Copyright and the Training of Human Authors and Generative Machines - Robert Brauneis - 47 Columbia Journal of Law and the Arts (forthcoming 2025).)
Section 2: AI and Copyright Challenges (Training Data)
AI platforms are like voracious readers, devouring vast amounts of text from the internet. They feast on a variety of content—websites, articles, books, social media posts, and academic papers. But here's the kicker: these platforms have no clue where the data comes from. They simply see the text/data and learn the intricate dance of words, phrases, and sentences, and generate content that is often indistinguishable from human work.
And this brings us to the elephant in the room. The bulk of this text (i.e., training data) is likely under copyright protection, except for those items in the public domain—facts, discoveries, or works whose copyright term has expired. It's a tangled web, and navigating it is no small feat. But that's the world we live in, where AI and copyright law are on a collision course.
Many platforms, such as ChatGPT, may well have infringed copyrights while training their models, which is why so many copyright actions have been filed against them.
As AI continues to advance, it is important for creators, developers, and policymakers to navigate these legal complexities and ensure that copyright law evolves to address the unique challenges posed by AI-generated content.
Here are some of the notable lawsuits that address this issue:
Andersen v. Stability AI et al.
On January 13, 2023, a group of artists filed a class-action complaint in the Northern District of California. Their target? The companies behind a trio of A.I. art generators. The artists argue that these services have violated copyright and unfair competition laws. They claim that the A.I. tools unlawfully scraped their artwork and used it in training datasets, essentially hijacking their original works without permission to train the A.I. They allege that over five billion images were scraped (and thereby copied) from the internet to train Stable Diffusion through the services of LAION (Large-Scale Artificial Intelligence Open Network), an organization paid by Stability. This, they say, allows users to generate works that are not sufficiently transformative from their existing, protected creations, making them unauthorized derivative works. If the court sides with the artists and finds the A.I.'s works to be unauthorized and derivative, the companies could face substantial infringement penalties.
Getty Images v. Stability AI
On February 3, 2023, Getty Images, an image licensing service, filed a lawsuit against the creators of Stable Diffusion on the grounds that the defendants trained AI tools using data lakes containing thousands, or even many millions, of unlicensed works. According to Getty, Stability AI improperly used its photos, violating copyright as well as trademark rights.
The New York Times v. Microsoft and OpenAI
The New York Times filed a copyright infringement lawsuit on 27 December 2023 against OpenAI and its partner, Microsoft. The accusation? OpenAI allegedly scraped millions of the Times' articles, along with content from other creators, to build the knowledge base powering ChatGPT.
Section 3: The Debate on AI-Generated Works (AI Output)
The question of whether AI-generated works can be considered original and therefore eligible for copyright protection is at the heart of a heated debate. Traditional copyright law is designed to protect human creativity, but the rise of AI-generated content challenges this notion. Can a machine's output be deemed original, or does it inherently lack the human touch required for copyright protection?
Legal scholars, courts, and copyright offices have weighed in on this issue with differing perspectives. Some argue that AI-generated works should not qualify for copyright protection since they lack human authorship.
The U.S. Copyright Office has maintained that works must be created by a human author to qualify for copyright protection. This stance was evident in Thaler v. Perlmutter, where the court ruled that an artwork created by an AI system could not be copyrighted because it was not a product of human creativity, and in the Office's registration decision on Zarya of the Dawn, which declined to protect the comic book's AI-generated images.
Others argue that the output of AI systems should be protected under copyright law if a human is involved in guiding and refining the AI's output. This perspective stresses the importance of human involvement in the creative process.
The role of human involvement in AI-generated content is a crucial factor in this debate. If a person merely provides a brief prompt to an AI system, the resulting work may not contain enough human creativity to qualify for copyright protection. However, if a human continually modifies and directs the AI to fine-tune the content, the work may be considered original and eligible for copyright protection. This perspective highlights the need for a clear understanding of the level of human involvement required to tip the scale in favor of copyright eligibility.
Additionally, there are concerns about the creation of derivative works. AI systems often generate content based on preexisting works, raising questions about whether these new creations infringe on the rights of the original authors. The concept of fair use also comes into play, as it allows for limited use of copyrighted material for purposes such as criticism, commentary, and research. Determining whether AI-generated content falls under fair use or constitutes infringement is a complex and evolving legal challenge.
Another significant issue is that large language models and generative text models rely heavily on consuming scholarly articles, journals, and scientific papers for training purposes. The fundamental premise is that, regardless of the output produced by these models, whether it is a verbatim copy of the training materials or substantially similar in some manner, the copyright owners should still be entitled to compensation for the value their work generates. This is because their works serve as the foundation upon which these AI systems build and create value. It is also well known that AI can sometimes produce output that is identical or very similar to the training material, in which case the infringement claim is at its strongest.
Creators, developers, and policymakers need to engage in this ongoing conversation and find a balanced approach that protects the rights of authors while encouraging innovation.
Section 4: Impact of AI on Creative Industries
AI has brought significant changes to creative industries, including art, music, literature, and software development. These changes have both positive and negative impacts, reshaping the way creative work is produced, distributed, consumed, and monetized.
Art
AI algorithms have been used to generate unique pieces of art. For instance, in 2018 the AI-generated artwork "Portrait of Edmond de Belamy" was sold at a Christie's auction for $432,500, showcasing the potential of AI in the art world. However, this raises legal questions about authorship and copyright. Who owns the rights to an artwork created by an AI? This is a complex issue that challenges traditional notions of copyright law.
In 2016, researchers from the Netherlands presented a new artwork generated by a computer, based on thousands of works by the famous 17th-century Dutch painter Rembrandt. The painting was printed using a 3D printer with a special paint base and UV ink.
Music
DeepMind, a Google-owned AI company, has developed software that generates music from recordings. AI-generated pieces can mimic famous composers' styles and can even be used to create entirely new genres. Copyright issues are important here as well. An ongoing question in the music industry is who should receive royalties when an AI composes a hit song.
According to Prof Gervais of Vanderbilt University Law School, "AI machine using a corpus of pop music can find correlations among the various songs and identify the elements (melody, harmony, pitch, etc.) that may be causing a song to be popular and then use this knowledge to write its own potential hit."
For example, Flow Machines, a laboratory funded by Sony, has developed a number of "pop" songs, including "Daddy's Car," composed in the style of The Beatles.
Literature
In 2017, Google started funding an AI program to write local news articles. In 2016, a short novel co-written by humans and AI, entitled "The Day a Computer Writes a Novel," passed the first round of a Japanese literary competition. Legal implications include concerns about plagiarism and the authenticity of AI-generated content. There are also ethical questions about the use of AI in literature and whether that use should be disclosed.
Software Development
In software development, AI tools like GitHub Copilot assist developers by suggesting code snippets and even writing entire functions. Although such use of AI can boost productivity, it raises legal concerns regarding code ownership and licensing. Who should be responsible when AI-generated code violates copyright?
While AI presents exciting opportunities for creating new works, it also poses substantial legal challenges. AI benefits can only be realized if policymakers and industry leaders work together to develop frameworks that address these issues and ensure legal and ethical standards are not compromised.
Conclusion
A significant amount of debate and legal interpretation has taken place regarding the issue of copyright protection for AI-generated works and the threshold of originality. The law must keep up with technological advancements by taking prompt action to navigate and resolve AI-related concerns.
Looking Ahead
In the next issue, we will discuss Machine Learning.
Thank you for joining me on this exploration of AI and law. Stay tuned for more in-depth analyses and discussions in my upcoming newsletters. Let's navigate this exciting and challenging landscape together.
Connect with me
I welcome your thoughts and feedback on this newsletter. Connect with me on LinkedIn to continue the conversation and stay updated on the latest developments in AI and law.
Disclaimer
The views and opinions expressed in this newsletter are solely my own and do not reflect the official policy or position of my employer, Cognizant Technology Solutions. This newsletter is an independent publication and has no affiliation with #Cognizant.