Shedding Light on AI Transparency in Foundational Models
Anshul Kumar
Generative AI Technology Evangelist | 2x LinkedIn Top AI Voice | Digital Transformation Leader
Introduction
This weekend, the research paper "The Foundation Model Transparency Index v1.1," published in May 2024, caught my attention. I have been raising similar questions in various forums about the need for transparency around foundational models across different parameters. In this article, we will explore this topic together, guided by insights from the report's authors.
What is Transparency with Respect to Foundational Models?
Foundational models have become an integral part of the generative artificial intelligence ecosystem, serving as the backbone for many applications in this field. Given their significance, it is essential that developers and distributors clearly disclose the models' characteristics, data sources, and other attributes to consumers and end users.
Unfortunately, transparency has often been lacking, placing foundational model developers under scrutiny for not providing sufficient information about their models.
The Foundation Model Transparency Index (FMTI)
The Foundation Model Transparency Index (FMTI) was first introduced in October 2023 by Bommasani et al. The first iteration (FMTI v1.0) scored ten major foundation model developers (e.g., OpenAI, Google, Meta) based on publicly available information regarding 100 transparency indicators.
These 100 indicators comprehensively characterize transparency for foundation model developers. They are divided into three broad domains, as described in Figure 1 below: indicators that are upstream of the model, indicators that relate to the model itself, and indicators that are downstream of the model.
Figure 1: A conceptual depiction of the foundation model supply chain, beginning with the primary upstream resources (i.e., data and compute) and transitioning to the foundation model, subsequent hosts (or distribution channels), and ending with downstream applications.
Image source & credit: What is a foundation model? | Ada Lovelace Institute
Transparency Indicators
These indicators span matters such as the data, labor, and compute used to build models; the capabilities, limitations, and risks associated with models; and the distribution of models as well as the impact of their use.
The upstream indicators identify the ingredients and processes involved in building a foundation model. There are 32 upstream indicators.
The model indicators identify the properties and function of the foundation model. There are 33 model indicators.
The downstream indicators identify the use of the foundation model, including details about its release. There are 35 downstream indicators.
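To make the scoring mechanics concrete, here is a minimal sketch of how an FMTI-style score could be computed from indicator assessments. The developer and its per-indicator results below are hypothetical; only the domain sizes (32 upstream, 33 model, 35 downstream) come from the report.

```python
# A minimal sketch of FMTI-style scoring, assuming a simple binary
# "satisfied / not satisfied" assessment per indicator.
# The developer and its results below are hypothetical; only the domain
# sizes (32 upstream, 33 model, 35 downstream) come from the report.

from typing import Dict, List

def score(disclosures: Dict[str, List[bool]]) -> Dict[str, float]:
    """Per-domain and overall scores as fractions of indicators satisfied."""
    results = {}
    total_met = total_indicators = 0
    for domain, met in disclosures.items():
        results[domain] = sum(met) / len(met)
        total_met += sum(met)
        total_indicators += len(met)
    results["overall"] = total_met / total_indicators
    return results

# Hypothetical developer satisfying 20/32 upstream, 25/33 model,
# and 22/35 downstream indicators (67/100 overall).
example = {
    "upstream":   [True] * 20 + [False] * 12,
    "model":      [True] * 25 + [False] * 8,
    "downstream": [True] * 22 + [False] * 13,
}
print(score(example))  # overall = 0.67 with these made-up numbers
```

Because the 100 indicators sum across the three domains, a developer's overall score is simply the count of indicators satisfied out of 100, which is what makes scores directly comparable across versions of the index.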
Overall scores, 2023
Scores by domain, 2023
Scores by major dimensions of transparency, 2023
The complete FMTI 2023 report can be read at arXiv:2310.12941 (arxiv.org).
The Foundation Model Transparency Index (FMTI) v1.1
To understand how the landscape has evolved in the last six months, a follow-up study (FMTI v1.1) was conducted. The same 100 transparency indicators were retained to enable direct comparison.
This time, instead of relying on public information, developers were asked to report relevant information for each indicator. Of the 19 foundation model developers contacted, 14 provided reports covering the 100 transparency indicators.
The FMTI v1.1 results demonstrate ample room for improvement, as well as tangible improvements in transparency over half a year. On average, developers disclosed information that satisfied 58 of the 100 transparency indicators.
Transparency increased in subdomains that were especially opaque in v1.0
Several of the areas of the index that were least transparent in v1.0 show significant improvement in v1.1, including subdomains such as compute, methods, risks, and usage policy. For example, the total score across companies for the compute subdomain rose from 17% in v1.0 to 51% in v1.1. Compute is potentially one of the most intractable areas for disclosure, as it relates to the environmental impact of building foundation models (and could therefore expose developers and deployers to additional legal liability), yet improvement appears across the compute indicators.
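As one plausible reading of that aggregate figure, a subdomain's score across companies can be taken as the fraction of (developer, indicator) pairs satisfied within the subdomain. The sketch below uses invented numbers; the subdomain size and per-developer counts are assumptions, not the report's data.

```python
# A hedged sketch of aggregating a subdomain score across developers:
# the fraction of (developer, indicator) pairs satisfied. All numbers
# below are invented for illustration; they are not the report's data.

NUM_COMPUTE_INDICATORS = 7  # hypothetical size of the compute subdomain

# developer -> compute indicators satisfied (hypothetical counts)
satisfied_by_developer = {"DeveloperA": 5, "DeveloperB": 3, "DeveloperC": 2}

satisfied = sum(satisfied_by_developer.values())
total = NUM_COMPUTE_INDICATORS * len(satisfied_by_developer)
print(f"compute subdomain score: {satisfied / total:.0%}")  # 48% here
```

Under this reading, the jump from 17% to 51% means developers collectively satisfied roughly three times as many compute indicators in v1.1 as in v1.0.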
The least transparent subdomains
Data access, impact, and trustworthiness are the least transparent subdomains. Developers score 50% or less on 10 of the 23 subdomains in the index, including 3 of the 5 largest subdomains: impact (15%), data (34%), and data labor (50%). The lack of transparency in these subdomains shows that the foundation model ecosystem is still quite opaque.
There is little information about how people use foundation models, what data is used to build foundation models, and whether foundation models are trustworthy.
These low scores reflect the ongoing crisis in data provenance, in which many companies share no information about the license status of their datasets, preventing downstream developers from ensuring that they comply with those licenses.
Why Frameworks Like FMTI Are Important
Frameworks like the Foundation Model Transparency Index play a crucial role by offering a consolidated view of various foundational models. They assess different transparency indicators, aiding decision-making for individuals and enterprises. Such frameworks are essential, and the development of more frameworks can further enhance transparency, ethics, and best practices in the responsible use of artificial intelligence.
Limitations of Such Frameworks
While frameworks like the Foundation Model Transparency Index are vital, they have limitations. Information gathering is challenging and depends on how much detail model developers are willing to share. Many developers are reluctant to disclose information due to reasons such as proprietary concerns and legal fears. This lack of openness can hinder the effectiveness of such frameworks in providing a complete view of transparency, ethics, and best practices in AI use.
Conclusion
To address the limitations and enhance the value of frameworks like FMTI, collaborative efforts between policymakers, developers, and researchers are essential. Increasing transparency in foundational models not only fosters trust but also drives ethical and responsible AI development.
References & Credits
References
The Foundation Model Transparency Index, arXiv:2310.12941 (arxiv.org)
Foundation Model Transparency Reports, arXiv:2402.16268 (arxiv.org)
The Foundation Model Transparency Index v1.1 (May 2024), arXiv:2407.12929 (arxiv.org)
Credits
My sincere thanks to the many individuals, institutions, and companies whose contributions are shaping the future of Generative AI and the responsible use of AI.
Disclaimer
The purpose of this article is to spread awareness and education in the field of Generative AI. The views expressed in this article are personal and are based on information available through various online resources.