OpenUK partnering with Meta opening the Llama 2 LLM
Amanda Brock
CEO OpenUK / SOOCon26; Computer Weekly 20th Most Influential Women Tech 23 & 24; Computing IT Leaders 100 23 & 24; Board Member; Advisor; Writer; International Keynote; Editor: Open Source Law, Policy & Practice; AuDHD
OpenUK split its Summer State of Open: The UK in 2023 report into two parts, ‘Economics of Open Source’ and ‘AI Openness’, in the full knowledge that the Llama 2 release would take place a few days later. The AI Openness report includes ‘Thought Leadership’ from both the Turing Institute and the Tony Blair Institute for Global Change calling for AI Openness, and from the Open Source Initiative (OSI) explaining its view that there needs to be a new category of ‘Open Source AI’ to meet the challenges of AI. The report pulls together a unique global overview of the status of AI Openness and open innovation as of July 2023. I have already shared thoughts on the first part, the Economics of Open Source report, and here I delve into the AI Openness report, the Llama 2 release and what they mean to open source.
OpenUK partnering with Meta
The unanimous decision of OpenUK’s Board to partner on the Llama 2 release was possible because, unlike most open organisations, OpenUK has a broad remit across the various opens, established at our inception in 2020. As a younger organisation for the open communities, we were able to look across the span of the opens of innovation from day one. Covering open source software, open hardware and open data, and considering these in the context of open standards and open innovation, has offered us a unique position for some time. We are members of and partner with organisations like the Linux Foundation, Eclipse Foundation and Open Source Initiative in open source software and the Open Data Institute in open data. This gives us a breadth of ongoing interaction that other organisations have not historically had, but which they are now seeking to establish, partly as a consequence of AI. AI requires us to look, as many past technologies have, at data alongside software and increasingly hardware. The focus created by AI makes this more apparent than it has been at any time, and follows the messaging OpenUK has been sharing for almost 4 years. We can’t consider software today without thinking about data and vice versa.
This very broad remit in support of "Open Technology" or "openness" allowed OpenUK to work with Meta as the only open organisation partnering in support of Llama 2’s release. Meta’s web site and the release statement supported by OpenUK focus on open innovation, a relatively undefined term: “We support an open innovation approach to AI. Responsible and open innovation gives us all a stake in the AI development process, bringing visibility, scrutiny and trust to these technologies. Opening today’s Llama models will let everyone benefit from this technology.”
This release on 18 July legitimised access to Llama 2 for the open communities, allowing more users to build on it in their projects in the open. The original LLaMA LLM, by contrast, was accessed beyond the duly licensed research communities only because of a leak earlier this year. That leak and the inevitable building on it gave absolute clarity on the potential power of open innovation in this space. It was not, however, by any stretch of the imagination open source, despite frequently being referred to as such by AI experts.
Llama 2’s release gives formal access to the foundational LLMs, software and algorithms, machine-learning model code, trained model weights, inference-enabling code, training-enabling code, fine-tuning-enabling code and other elements, and delivers a manual and documentation. All of this is available to and usable by all, opening it beyond the research community originally licensed in February. This open sharing of the software in the AI is an important milestone in AI.
What is open source software?
The tech elite have repeatedly shown ignorance of open source software and what it is. Often they have failed to understand its value and its nature as a community contract. At its heart sits the Open Source Definition (OSD), created by Bruce Perens in the late 1990s and largely based on the existing Debian principles (the Debian Free Software Guidelines). All licences that are ‘open source software’ licences meet the 10 criteria of the OSD and have been approved by the OSI. Software which has its source code (the human readable code) shared, and which is distributed on an OSI approved standard licence without any modification, is open source software. That’s the generally accepted legal definition.
Bruce explains that he was badly advised on trademarks in the early days of the OSI and did not trademark the term “open source” initially. By the time an application to register open source as a trade mark was made it failed, as the term had become somewhat generic. That has impacted the use of the term open source and goes some way to explain why its use hasn’t been better policed and enforced over the years. This lack of enforcement has in turn fed into a lack of understanding, even amongst the tech sector. On occasion this has led to misuse of the term open source, sometimes referred to as ‘open washing’: implying the benefits of open source without delivering true open source.
Criteria 5 and 6 of the OSD mean that anyone can use open source software for any purpose. There can be no restriction, no moral judgement or view of right or wrong, and no restriction of another’s commercialisation of your open source licensed code. That is fundamental to the free flow of open source software which our digital economies rely on today. That cannot change, as millions of packages rely on it legally. It does not, however, sit well with the position of regulators on AI.
One of the clearest examples of open washing is MongoDB’s CEO describing open source as a ‘marketing tool’. Using a bait and switch tactic, the company took advantage of open source’s unprecedented ability to create adoption at pace, effectively baiting users and then switching from an OSI approved licence allowing unfettered usage to a licence that was not OSI approved, despite continuing to make the source available. This type of licence, which includes restrictions that are not OSD compliant, such as limits on commercialisation, is often referred to as source available or public source.
However, the legal definition alone does not make open source software. Real open source also requires collaboration and a healthy community of contributors, hence the existence of a social contract aspect, which is so fundamental to success. The switch in bait and switch is widely regarded as a breach of that social contract.
Open source has been described as enabling your competitors with your own innovation, and that is a potential impact of open sourcing your code. Open source is not itself a business model, and in making decisions to open source in a business context you must be aware of this.
An ecosystem of millions of software packages and billions of users relies on the OSD and the meaning of open source, which merits it being given some care.
Why the sudden interest in the meaning of open source and AI?
Regulation of course overrides licences and nothing can be done to opt out of regulation. An example we have lived with for many years is export control restricting the distribution of open source software into certain countries.
The last couple of weeks have seen more press coverage of the meaning of open source software than in the previous two decades. This was sparked by the incorrect identification of Llama 2 as open source by certain senior Meta execs, despite the clarity on the Llama 2 web site that it is open innovation, not open source software. Llama 2 is distributed on the Llama 2 Community Licence and an Acceptable Use Policy (AUP). Understanding that Llama 2 is not open source software, whatever the social media posts from some of Meta's most senior team have said, is important to the open source communities building on it and equally to AI leaders and regulators.
The Llama 2 Community Licence is not approved by the Open Source Initiative and clearly does not meet the requirements of the Open Source Definition. Llama 2 is not available for any purpose and the licence has commercial restrictions. It should not be described as “open source”, and the use of this term by some Meta executives in social commentary on Llama 2 has led to claims of open washing and to the OSI requesting that it be retracted and that Meta clarify the position. The Llama 2 licence combined with its Acceptable Use Policy (AUP) sets out the parameters of responsible use, and Llama 2 is rightly described on the Llama site as ‘open innovation’.
The value of AI Openness
Even with Meta’s provision of a manual and documentation to support Llama 2, most people are not rushing to download it to build their own LLM. Speaking at The Future of Britain Conference in London, Vishal Sikka explained that we need more AI skills because across the globe only 1.5 million people can build an AI app, only about 200k can automate an AI system, and fewer than 50k can explain how ChatGPT works.
This reality means that we need more people to learn. Meta’s opening up of the Llama 2 LLM, even on this restricted basis, helps to support that learning by giving technologists access to an LLM to learn from and to develop further, building their own products and tools. It enables developers to build on it in the open and creates a transparent ecosystem, using Hugging Face, the AI equivalent of GitHub.
This visibility enables transparency, which allows trust that what is being done complies with the AUP. These requirements on behaviour could not exist in open source software: they are restrictions on use of the code that remove the possibility of it meeting the OSD and being open source software. The key values of open innovation, however, include transparency, which in this context creates both trust and control.
A new kind of openness
At a time when our regulators and governments are struggling to clarify what “Appropriate Use” of AI is, the terms of the Llama 2 AUP may carry great influence. Given Meta’s continued global regulatory engagement, they are likely themselves based on in-depth discussions with regulators. See the tweet of 19 June from Margrethe Vestager, Vice President of Digital at the European Commission (complete with a picture of herself meeting with Mark Zuckerberg that day): “#AI code of conduct in motion. Today with Mark #Zuckerberg @Meta, the conversation focused on how to mitigate risks in #OpenSource environment.”
The AUP and its restrictions have been relatively consistently missed by press and commentators, who generally focus on the licence. Whatever the licence says, the AUP’s existence means Llama 2 is not open source and cannot be. Even if Meta were persuaded to change its Llama 2 Community Licence to an OSI approved one, the AUP’s existence would still prevent Llama 2 from being open source.
The OSI itself understands change is coming and is currently working through a consultation to define what “Open Source AI” is. It is looking to create a shared set of principles that recreate the permission-less, pragmatic and simplified collaboration of open source for AI practitioners, implying that the current "Open Source" is not adequate. Its call for papers is open. Beyond the OSI and open source software lies open data, which is fundamental to AI Openness and is the remit of organisations like the UK’s Open Data Institute and the Open Knowledge Foundation, along with licensing that is not related to software, like the Linux Foundation’s CDLA licence.
The UK and AI
“If the UK is to be a global leader in AI, it is also important that the UK takes a clear position on the Open Source Community. The software that has been built by this community underpins the modern internet, from infrastructure to operating systems to algorithms. As a result of recent progress, Open Source is again enabling cutting-edge development in AI.
The European Union, which tends to take a position of regulator of resort, has decided to weaken the position of the Open Source community through the AI Act - also the controversial Cyber Resilience Act and the Product Liability Directive are perceived as undermining open source software with a lack of understanding of user accountability at the heart of this misunderstanding.
This is a mistake in the promotion of innovation and the UK should not follow suit. Instead, we should use the opportunity to offer a different model. The UK has to show leadership in building its industry, including Open Source. This requires taking a clear position on the value that openness brings. There is a risk that this technology is confined to the hands of just a few actors, with the potential dividends being too narrow,” said Benedict Macon-Cooney, Chief Strategist, Tony Blair Institute for Global Change.
For the UK, AI openness offers the opportunity to lead and has the potential to be a game changer. As Rishi Sunak emphasises, we have been a world leader in developing AI. Yet 7 companies in the US and China have to date held 90% of AI’s compute power, largely through closed LLMs, which only they can access and which nobody can see under the hood of. It’s like the bonnet of every vehicle we drive being locked so nobody can see the engine. If something goes wrong we don’t know why or how to fix it, we can’t take it to a local garage, and we become completely dependent on the person or company who created it and who holds the key. That is obviously neither a safe nor a secure position to be in, and it is one where the key holder has dominance. With no transparency into the key holder’s activity we also have no way of trusting that they are not in fact a bad actor.
Experts in various disciplines must collaborate to understand and manage the risks in AI Openness. The UK’s leadership in open source software, where it is #1 in Europe, together with its global AI positioning means it has a real opportunity to lead. To be successful, Ian Hogarth and Lord Camrose, our AI Minister, must bring the right people with the deepest understanding together to collaborate. If they do, and with the White House’s support of the PM’s conference, the UK may well lead in opening up AI.
Technosocial Sensemaker, Business Builder, CEO of Ipseita Limited
10 months ago: The benefits of opening Llama 2 should not be underestimated. I believe time will show that its performance can address many of today's challenges for process automation, which is hugely valuable. Opening it creates a ratchet effect; its capabilities are secured into broad public access.
Open Models [Enthusiast | Trainer] | Knowledge Base Maintainer (somehow)
1 year ago: I don't think that a (somehow opened) resource should be called open innovation, as open innovation is a process, a flow of knowledge. Open innovation of course occurs with LLaMA, but it happens as well with open source software. I'm not in the AI field, and I looked into the LLaMA topic out of an interest in openness, but maybe LLaMA should be called an open model to define the resource, a term used by the Hugging Face community: https://huggingface.co/blog/llama2. Even if I consider the OSI statement on LLaMA a bad move (no criticism of the lack of source code...), the initiative to define open AI is coming at a great moment. The wording looks pretty messy for now, but what I see looks good. I also wrote an article to share my point of view on the topic: https://www.dhirubhai.net/pulse/llama-lia-de-meta-plut%25C3%25B4t-open-model-quopen-source-simon-rossi One thing surprised me the most: at the first release in February, LLaMA was “part of Meta’s commitment to open science”. An intent to help “researchers advance their work in this subfield of AI” without giving them the ability to make fundamental modifications and to create other LLMs makes me particularly sceptical about their strategy.