The Challenges for License Compliance and Copyright with AI
So you want to use AI-generated code in your software, or maybe your developers are already using it. Is it too risky? Large language model technology is progressing rapidly, and policy makers are ill-equipped to catch up quickly. Anything resembling legal clarity may take years to come about. Some organizations are deciding not to use AI at all for code generation, others are using it cautiously, but everyone has questions.
Before we begin, a few disclaimers
While this is a global issue, this blog is primarily focused on the United States as this is where most companies providing generative AI are based and therefore where most legal challenges are taking place. However the law in the US settles, it may settle very differently in other jurisdictions. We are not lawyers, in the US or anywhere else, and none of this should be construed as legal advice. Likewise, this blog is only a snapshot of where things are at this specific moment in time. The policies and legal precedents surrounding generative AI are still evolving and subject to all kinds of changes including unexpected twists and upsets.
With that out of the way, let's get to the heart of the matter – two simple questions users of AI-generated code are asking and two complex and possibly unsatisfying answers.
Can I be sued?
If your developers are using AI-generated code, and that AI was trained on open source software, you are likely concerned that the generated code is similar enough to open source software to require compliance with a license. If so, the worst-case scenario puts your project at risk of being subsumed under a GPL license. Even the best-case scenario of simply requiring attribution is onerous if it means tracking AI-generated code and identifying the open source code it's similar to, a task not yet solved by commercial software aside from cumbersome plagiarism detectors with high false positive rates. It's a tricky situation and there isn't much guidance on best practices yet.
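To make the similarity-matching problem concrete, here is a minimal sketch of one common approach: comparing overlapping token windows ("shingles") with a Jaccard score. This is an illustration only, not how any commercial detector or Copilot's filter works. The function names, the window size of 5, and the 0.6 threshold are all hypothetical choices, and the score says nothing about whether a license obligation actually applies.

```python
import re


def shingles(code: str, k: int = 5) -> set:
    """Return the set of k-token windows in a snippet (whitespace is ignored)."""
    # Hypothetical tokenizer: identifiers, integers, or single symbols.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    if len(tokens) < k:
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: str, b: str, k: int = 5) -> float:
    """Share of k-token windows the two snippets have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def flag_similar(generated: str, corpus: dict, threshold: float = 0.6) -> list:
    """List (name, score) for known snippets at or above an assumed threshold."""
    scored = [(name, jaccard(generated, source)) for name, source in corpus.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Short, idiomatic snippets written independently will often share most of their token windows, which is one reason tools built on this kind of matching produce so many false positives.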
Good news for users of GitHub Copilot, though. In late September Microsoft, GitHub's parent company, made an announcement regarding copyright and code generated by their models. If you get sued because you used code generated by Copilot, Microsoft promises to pick up the bill if you used Copilot with the appropriate filters (like duplicate detection) turned on:
"As customers ask whether they can use Microsoft’s Copilot services and the output they generate without worrying about copyright claims, we are providing a straightforward answer: yes, you can, and if you are challenged on copyright grounds, we will assume responsibility for the potential legal risks involved."
That doesn't mean the law is guaranteed to settle on Microsoft's side, but it does signal loudly that they're confident they have a strong legal case. A lawsuit alleging Microsoft, GitHub, and OpenAI infringed on open source licenses and copyrights when training their models is working its way through the US legal system and likely will be for some time. Microsoft argues that anyone has a right to look over public code on GitHub to understand and learn from it and even write similar – but not outright copied – code, and that includes their models. OpenAI hasn't promised to pay legal fees for its users, but if Microsoft's argument holds up, it will be good news for OpenAI and its users too.
Can I sue?
If you are using AI-generated code in whole or in part to create software, you're probably wondering if you have rights to the code and whether using AI-generated code affects the rights to the rest of your code. So far, the US courts and the US Copyright Office are holding firm that, for the purposes of copyright and patents, an author must be human, though this requirement is likely to continue to be challenged.
For now, AI-generated code is largely viewed as having no copyright. If anyone has a claim to the copyright of outputs, it's the rights holders of the models themselves and as far as the big players go, like OpenAI, they seem to not particularly want those rights. OpenAI explicitly states in their terms of use that they transfer ownership and rights to you, the user, thereby using a contract to navigate past copyright questions altogether. However, this contract also states that you don't own anyone else's output, meaning you can't stop someone from generating and using the same thing you once generated and used.
But what about your project as a whole? Can you copyright the work as a whole if you used AI-generated code within it? The US Copyright Office has issued some guidance on the topic of AI-generated materials but cites no examples of AI-written (or partially written) software. At some point a copyright application that involves software and AI-generated code will surely come to light, but for now we must consider general guidance and examples of other art forms and infer how their treatment might translate.
Continue reading: https://go.mend.io/3TBkSjM