Houston, We Have...The Open Source AI Definition
Fernando Adrián García Marc
CLO @ Fossity | #OpenSourceSoftware #Auditing #SoftwareLicensing #MergersAndAcquisitions
The Open Source Initiative (OSI) recently released the first official definition of “Open Source AI” with the Open Source AI Definition (OSAID). This milestone follows years of industry debate on what it means for artificial intelligence (AI) to be truly open source, considering the complexities of modern AI models and their creation. OSAID clarifies the standards required for an AI model to be genuinely open, aligning developers, policymakers, and industry stakeholders on consistent terms and conditions. This new definition aims to ensure transparency, accessibility, and reusability for AI technology in a way that software licensing standards have historically done for open-source software.
The OSAID sets three essential requirements for an AI model to qualify as open source. First, it must provide detailed information about the training data used, enabling others to recreate the model or understand its biases and limitations. This includes the origin of the data, processing methods, and any licensing restrictions attached to the dataset. Second, the complete code necessary to build, train, and run the model must be available. This is critical for fostering an environment where developers can modify or enhance the model. Third, the model must disclose its settings and parameters, including the weights and tuning that impact its functionality. Together, these components ensure that users can not only use the model but also gain full insight into its construction and performance.
By setting these requirements, the OSI seeks to address the growing trend of "open washing," where companies claim their models are open source without meeting transparency or accessibility standards. The new definition also highlights a division within the industry. Meta, for example, markets its Llama models as open source but restricts commercial use for applications with over 700 million monthly active users and withholds its training data. Such limitations mean that Llama, and similar models from other tech giants, fail to meet OSAID's openness criteria, despite being available for download and certain forms of use.
Meta has voiced its reservations about OSAID, arguing that the AI landscape is too complex for a single open-source standard. The company justifies its restrictions on training data as a safeguard against potential misuse and as a way to protect its competitive advantage. This approach is echoed by several tech companies, which view unrestricted access to their models as a business risk due to the resources and data involved in their development. For instance, Stability AI, another major player, places enterprise licensing conditions on companies generating over $1 million in annual revenue, despite presenting itself as an open-source model provider.
The creation of OSAID could also have legal implications. Revealing comprehensive information about AI models might expose companies to intellectual property and copyright lawsuits, especially given that many models are trained on datasets scraped from various public sources. Plaintiffs and courts might increasingly refer to OSAID to scrutinize companies claiming open-source status without full transparency. As legal challenges around AI development grow, OSAID could serve as a baseline reference for transparency and accountability in AI, encouraging companies to clearly define the openness of their models and methodologies.
The introduction of OSAID reflects the OSI’s continued commitment to open-source principles adapted to the AI age. While it lacks enforcement power, the organization hopes that the tech community and potential legal frameworks will adopt and uphold these standards. With widespread adherence to OSAID, the tech industry could foster an environment where open-source AI models genuinely democratize technology rather than reinforcing existing power structures. As the debate continues, stakeholders look forward to updates in the definition, particularly around complex issues like proprietary data licensing, which may shape the future of accessible and equitable AI development.
Note: The preceding text is provided for informational purposes only and does not constitute legal or business advice. The views expressed in the text are solely those of the writer and do not necessarily represent the views of any organization or entity.
#OpenSourceSoftware #AI #OpenAI #Technology #Business
AI Leader · CEO/CTO · MBA · Founder · Xoogler
1 day ago
The OSAID has been rejected by the Open Source community and should be repealed. Here’s some more background for future posts on the subject: https://samjohnston.org/2024/11/09/so-you-want-to-write-about-the-open-source-ai-definition/