Shedding Light on AI Transparency in Foundational Models
Image is generated using Microsoft Designer


Introduction

This weekend, the research paper "The Foundation Model Transparency Index v1.1," published in May 2024, caught my attention. I have been raising similar questions in various forums about the need for transparency around foundational models across different parameters. In this article, we will explore this topic together, guided by insights from that report.


What is Transparency with Respect to Foundational Models?

Foundational models have become an integral part of the generative artificial intelligence ecosystem, serving as the backbone for many applications in this field. Given their significance, it is essential that developers and distributors clearly disclose the models' characteristics, data sources, and other attributes to consumers and end users.

Unfortunately, transparency has often been lacking, placing foundational model developers under scrutiny for not providing sufficient information about their models.


The Foundation Model Transparency Index (FMTI)

The Foundation Model Transparency Index (FMTI) was first introduced in October 2023 by Bommasani et al. The first iteration (FMTI v1.0) scored ten major foundation model developers (e.g., OpenAI, Google, Meta) on 100 transparency indicators, based on publicly available information.

These 100 indicators comprehensively characterize transparency for foundation model developers. They are divided into three broad domains, as described in Figure 1 below: indicators that are upstream of the model, indicators that relate to the model itself, and indicators that are downstream of the model.

Figure 1
A conceptual depiction of the foundation model supply chain, beginning with the primary upstream resources (i.e. data, compute) and transitioning to the foundation model, subsequent hosts (or distribution channels), and ending with downstream applications

Image source & credit - What is a foundation model? | Ada Lovelace Institute

Transparency Indicators

These indicators span matters such as the data, labor, and compute used to build models; the capabilities, limitations, and risks associated with models; and the distribution of models as well as the impact of their use.

The upstream indicators identify the ingredients and processes involved in building a foundation model. There are 32 upstream indicators.

The model indicators identify the properties and function of the foundation model. There are 33 model indicators.

The downstream indicators identify the use of the foundation model, including details about its release. There are 35 downstream indicators.
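To make the scoring scheme concrete, here is a minimal, hypothetical Python sketch of how an indicator-based index like FMTI can be aggregated. The function names and the example counts are my own illustrations, not the actual FMTI methodology or data; only the domain sizes (32 upstream, 33 model, 35 downstream) come from the report.

```python
# Illustrative sketch: aggregating indicator-based transparency scores.
# Domain sizes are taken from the FMTI report; everything else is hypothetical.

DOMAIN_SIZES = {"upstream": 32, "model": 33, "downstream": 35}  # sums to 100

def overall_score(satisfied):
    """Total indicators satisfied across all domains (out of 100).

    satisfied: dict mapping domain name -> number of indicators satisfied.
    """
    return sum(satisfied.get(domain, 0) for domain in DOMAIN_SIZES)

def domain_fractions(satisfied):
    """Fraction of indicators satisfied within each domain."""
    return {domain: satisfied.get(domain, 0) / size
            for domain, size in DOMAIN_SIZES.items()}

# Example: a hypothetical developer satisfying 20 upstream, 25 model,
# and 13 downstream indicators.
developer = {"upstream": 20, "model": 25, "downstream": 13}
print(overall_score(developer))    # 58
print(domain_fractions(developer))
```

A per-domain breakdown like this is what makes the index actionable: two developers with the same overall score can be opaque in very different places.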

The 100 indicators of the Foundation Model Transparency Index spanning the 3 domains: upstream, model, and downstream

Overall Score, 2023

The overall Foundation Model Transparency Index score and ranking across all 100 indicators.

Scores by Domain, 2023

The aggregate score of each developer broken down by the three domains: upstream, model, and downstream.

Scores by Major Dimensions of Transparency, 2023

The fraction of achieved indicators in each of the 13 major dimensions of transparency. Major dimensions of transparency are larger groupings of the 23 subdomains.
The complete FMTI 2023 report is available at 2310.12941 (arxiv.org).

The Foundation Model Transparency Index (FMTI) v1.1

To understand how the landscape has evolved in the last six months, a follow-up study (FMTI v1.1) was conducted. The same 100 transparency indicators were retained to enable direct comparison.

This time, instead of relying on public information, developers were asked to report relevant information for each indicator. Of the 19 foundation model developers contacted, 14 submitted reports covering the 100 transparency indicators.

The FMTI v1.0 and v1.1 overall scores for the eight developers assessed in both versions


The FMTI v1.1 results demonstrate tangible improvements in transparency over half a year, as well as ample room for further improvement. On average, developers disclosed information satisfying 58 of the 100 transparency indicators.

The fraction of achieved indicators in each of the 13 major dimensions of transparency, as measured in FMTI v1.1.

Transparency Increased in Subdomains That Were Especially Opaque in v1.0

Several of the areas of the index that were least transparent in v1.0 show significant improvement in v1.1, including subdomains such as compute, methods, risks, and usage policy. For example, the aggregate score across developers for the compute subdomain rose from 17% in v1.0 to 51% in v1.1. Compute is potentially one of the most intractable areas for disclosure, as it relates to the environmental impact of building foundation models and could therefore expose developers and deployers to additional legal liability; yet we see improvement across compute indicators.

The Least Transparent Subdomains

Data access, impact, and trustworthiness are the least transparent subdomains. Developers score 50% or less on 10 of the 23 subdomains in the index, including 3 of the 5 largest subdomains: impact (15%), data (34%), and data labor (50%). The lack of transparency in these subdomains shows that the foundation model ecosystem is still quite opaque.

There is little information about how people use foundation models, what data is used to build foundation models, and whether foundation models are trustworthy.

These low scores reflect an ongoing crisis in data provenance: many companies share no information about the license status of their datasets, preventing downstream developers from ensuring that they comply with those licenses.

Why Frameworks Like FMTI Are Important

Frameworks like the Foundation Model Transparency Index play a crucial role by offering a consolidated view of various foundational models. They assess different transparency indicators, aiding decision-making for individuals and enterprises. Such frameworks are essential, and the development of more frameworks can further enhance transparency, ethics, and best practices in the responsible use of artificial intelligence.

Limitations of Such Frameworks

While frameworks like the Foundation Model Transparency Index are vital, they have limitations. Information gathering is challenging and depends on how much detail model developers are willing to share. Many developers are reluctant to disclose information due to reasons such as proprietary concerns and legal fears. This lack of openness can hinder the effectiveness of such frameworks in providing a complete view of transparency, ethics, and best practices in AI use.

Conclusion

To address the limitations and enhance the value of frameworks like FMTI, collaborative efforts between policymakers, developers, and researchers are essential. Increasing transparency in foundational models not only fosters trust but also drives ethical and responsible AI development.

References & Credits

References

The Foundation Model Transparency Index, arXiv:2310.12941 (arxiv.org)

Foundation Model Transparency Reports, arXiv:2402.16268 (arxiv.org)

The Foundation Model Transparency Index v1.1 (May 2024), arXiv:2407.12929 (arxiv.org)

What is a foundation model? | Ada Lovelace Institute

Credits

My sincere thanks to the many individuals, institutions, and companies whose contributions are shaping the future of Generative AI and its responsible use.

Disclaimer

The purpose of this article is to spread awareness and education in the field of Generative AI. The views expressed are personal and are based on information available through various online resources.

