Open source liability in the age of AI
I've received a few questions recently about open source technologies and liability, and about how open source developers expose themselves to it. Today, nearly every commercial product utilizes open source technologies. When you produce an open source product and publish it through common platforms like NPM, Maven, and GitHub, there is no way to know who is downloading it or how it is being used. And when it is used commercially, new liabilities emerge.
One driver of this liability is inbound legislation from the European Union, makers of the Cookie Banner popups which plague the internet (Cookies, the GDPR, and the ePrivacy Directive - GDPR.eu). Under the EU's evolving Cyber Resilience Act (CRA), the EU is extending the definition of 'product' to include software products and services. The latest press release on the legislation, and the following excerpts, can be found here:
Under the new liability directive, the definition of ‘product’ will be extended to digital manufacturing files and software.
While it specifically excludes open source software, it does not exclude the products and services that integrate it.
Free and open-source software that is developed or supplied outside the course of a commercial activity is excluded from the scope of the directive.
What does this mean when open source is routinely included in commercial products, or when open source products are supplied with value-add hosting and support?
For example, many of the most popular open source products available are built by commercial entities (React, Angular, MongoDB, etc.). It also appears that if you release open source software and then host it as a SaaS, you are potentially liable for the use of that SaaS. Quite specifically, SaaS products are included in the regulations.
It has therefore been decided that online platforms can also be held liable for a defective product if they present the product or otherwise enable the transaction for the sale of the product in a way that would lead an average consumer to believe that the product is provided either by the online platform itself or by a trader acting under its authority or control.
There appear to be several new areas of risk emerging. To understand the potential impact, Debian, the popular makers of a flavour of Linux operating system, recently published their concerns, which do a good job of capturing the impact: https://bits.debian.org/2023/12/debian-statement-cyber-resillience-act.md.html. Specifically, they note:
Imposing requirements such as those proposed in the act makes it legally perilous for others to redistribute our work and endangers our commitment to "provide an integrated system of high-quality materials with no legal restrictions that would prevent such uses of the system".
It appears that even non-profit organizations could face expanded liability, including the Linux Foundation, Debian, the Apache Foundation, and others that receive significant contributions and pay salaries for certain roles.
These new rules may have a major impact on the creation and use of open source software, and alter the shape of the community providing these products. And as Debian notes, software gets more vulnerable when it is closed source.
Liability in the age of AI
And what about AI-based software? Now that AI is capable of producing sophisticated software solutions, the residual risks remain with people. And since AI works cannot be copyrighted (The US Copyright Office says an AI can’t copyright its art - The Verge), these works fall into dubious legal zones until legislation or court rulings catch up with the technology. Is the "producer" of the software the person who wrote the prompt, the underlying LLM service that generated the code, or even the intermediary service (like a chat user interface) that connected the two?
Since summer 2023, numerous multi-billion dollar lawsuits have been filed over the use of AI, most famously The New York Times v. OpenAI.
A running list is maintained at OpenAI and ChatGPT Lawsuit List – Originality.AI
Central to these lawsuits is the topic of copyright. While most open source software includes a license (learn more at Open-source license - Wikipedia), these licenses also contain copyright and liability statements that are lost or discarded when LLMs are trained on the code.
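To make that concrete, license and copyright terms usually travel with source files only as plain comments, for example via the common SPDX header convention. The sketch below is a generic, hypothetical file (the function and names are invented for illustration); note that nothing ties the header to the code, so when a model learns from the function body alone, the license context simply disappears.

```javascript
// SPDX-License-Identifier: MIT
// Copyright (c) 2024 Example Author
//
// The permission notice and liability disclaimer live only in this
// comment. If an LLM is trained on (or reproduces) the function body
// below without the header, the license terms are silently dropped.

// A hypothetical utility of the kind an LLM might regenerate verbatim.
function slugify(title) {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-") // collapse non-alphanumerics to "-"
    .replace(/^-|-$/g, "");      // strip leading/trailing "-"
}

module.exports = { slugify };
```

The code itself is unremarkable; the point is that the only link between it and its license is a comment that tooling, copy-paste, and model training can all strip away.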
The use of LLMs to produce software products, whether via Copilot, GPT, or any other service, can dramatically increase the speed, quality, and accuracy of developer code while also automating the writing of test scripts and documentation. However, it is unreasonable to expect that the story of liability will end with the use of New York Times articles. The software industry is larger than and just as litigious as any other industry (Google v. Oracle Explained: The Fight for Interoperable Software - IEEE Spectrum), so the AI-generated code in your killer app may just be the next target.
For Coders
Although I am not a lawyer and cannot offer advice, I strongly suggest that people consider how they produce and 'productize' open source software, especially in the age of AI, to limit their personal liability. We may be years or even decades away from the courts deciding how this will come down. While we wait, your liability as a product creator remains open for 25 years, according to the EU's new rules.
As a solo coder attempting to do some good for the world, you likely don't have the deep pockets to defend your NPM widget against legal claims. Perhaps it is time for a new form of open source license, or a new way of sharing these products with the world.
It may be time for a hackathon to figure this out.
We can expect more and more cases like this, where legislation, policy, and court rulings play catch-up with exponential IT. Andrew Kum-Seun, Marc Perrella & Ari Glaizel