What Transparency and Other Related Terms Mean in the Context of AI Systems

Transparency, explainability and interpretability are terms frequently used in the context of AI governance. They are often thought of as elements of trustworthy AI. But what exactly do they mean? Let me share some thoughts.

Transparency

Transparency is a characteristic that may or may not manifest in the practices of AI systems’ designers, developers and deployers: in the way they disclose, or fail to disclose, information about the AI system and its outputs to others. [1, p. 15-16]

For example, when AI system designers and developers actively involve other relevant stakeholders in the system’s design and development before it is deployed, we can say that they are adding some transparency to their processes. When they additionally disclose important information about how the AI system works and what its specifications, capabilities and limitations are, this adds further transparency.

Even more so when people get to know the reasons behind decisions about the chosen system design, training data, model structure, intended use cases and other aspects relevant to the system’s design, development, deployment and use.

Likewise, when AI system deployers inform affected people or entities that an AI system is being used, and when they further disclose the important characteristics of the AI system or of the broader AI-enabled process, we can say that these deployers demonstrate transparency.

In all cases, to promote real transparency, disclosure of the information should be tailored to the context, including the role and knowledge of the person receiving the information. This is measured by the end result: real transparency increases the recipient’s level of understanding of the underlying processes.

Transparency also aims to prevent or at least mitigate adverse outcomes that the AI system may cause, and may help those affected to seek redress.

Very importantly, transparency also enables those outside the AI system development process to estimate to what extent other AI trustworthiness characteristics (such as accuracy, privacy, security or safety) are present in that system, if at all.

Accountability

Accountability is another characteristic that may or may not be present across the AI system lifecycle. It depends on transparency and means that the appropriate organisations, teams and individuals are accountable for the proper state and outcomes of the AI system throughout that lifecycle.

Accountability also usually presupposes that these teams and individuals are empowered, responsible and trained for respective tasks, thanks to appropriate structures — policies, processes, procedures and technical measures — that are put in place. [1, p. 23]

In a certain sense, one may also say that “Maintaining organizational practices and governing structures for harm reduction, like risk management, can help lead to more accountable systems.” [1, p. 16] This is the position encapsulated in the NIST AI Risk Management Framework (NIST AI RMF).

However, I personally take issue with such phrasing: only people and organisations, that is, entities with legal agency, and not artefacts such as AI systems, can properly be deemed to have, or to lack, accountability.

Explainability and Interpretability

Explainability, interpretability and transparency are related concepts that are sometimes misinterpreted and are — confusingly — often used interchangeably.

For example, as per the NIST AI RMF [1, p. 15-17]:

- Transparency relates to the practices of an AI system’s designers, developers and deployers towards users, affected people and other stakeholders. It answers the question “What is happening in the system?”;

- Explainability relates to the system itself and answers the question “How does the system function and produce its outputs, such as automated decisions?”;

- Interpretability relates to the system’s outputs and answers the question “Why does the system do what it does, and in particular, why has it arrived at a particular automated decision?”.

Other well-cited sources, however, offer a different perspective, associating interpretability with the degree to which the machine learning model itself — passively — makes sense (is transparent) to a human observer, and explainability with the active steps that can be taken to make the functioning of the model clear or easy to understand [2, p. 85].
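
To make this second reading concrete, here is a minimal, illustrative sketch in Python. The dataset, feature names and scikit-learn models are my own assumptions for demonstration, not taken from the cited sources: a shallow decision tree is interpretable in the passive sense, because its learned rules can simply be printed and read, while a black-box ensemble needs an explicit post-hoc action, such as permutation importance, before its behaviour becomes easier to understand.

```python
# Illustrative sketch only: passive interpretability vs. an active,
# post-hoc explanation step. Data and model choices are assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
feature_names = [f"x{i}" for i in range(5)]

# Interpretable by design: the fitted rules can be read directly.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=feature_names))

# Black box: an explicit post-hoc action is needed to explain it.
forest = RandomForestClassifier(random_state=0).fit(X, y)
imp = permutation_importance(forest, X, y, n_repeats=10, random_state=0)
for name, score in zip(feature_names, imp.importances_mean):
    print(f"{name}: mean importance = {score:.3f}")
```

The point is not the specific technique: any deliberate step taken to explain a model that is not transparent by itself falls on the explainability side of this distinction.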

Understandability or Intelligibility

The key overarching notion here seems to be understandability (or intelligibility), which is a characteristic of a machine learning model that enables humans to understand how the model works without needing to delve into its internal structure and the algorithms involved. [2, p. 84]

Early (symbolic, expert) AI systems were relatively simple and therefore more easily understandable. However, newer ones, based on deep neural networks, have become increasingly complex and therefore opaque — hence the notion of "black-box" models. [2, p. 83]

While sometimes considered better performing, black-box models are also limited in terms of their trustworthiness and implementability, especially for higher-risk use cases, where insufficient transparency, interpretability and explainability may present a legal and/or ethical problem.

As a result, research has increasingly concentrated on interpretable or explainable AI techniques (often shortened to XAI), which provide the following benefits [2, p. 83], illustrated with a brief sketch after the list:

1) improve impartiality in decision-making, in particular, by enabling better identification and removal of undesirable bias in training datasets,

2) improve robustness by ensuring better identification of adversarial influences that may be affecting model outputs, such as automated predictions,

3) ensure that only meaningful and not irrelevant (for example, legally irrelevant) variables are influencing model outputs and automated decisions.
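
As a hypothetical illustration of point 3) above (the data, the choice of column and the model are all my own assumptions, not taken from [2]), one simple check is to shuffle a variable that should be irrelevant to the decision and see how many of the model's predictions change as a result:

```python
# Illustrative sketch only: does a supposedly irrelevant column actually
# influence the model's outputs? All names and data are hypothetical.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1000, n_features=6, random_state=0)

# Pretend column 5 is a legally irrelevant attribute that must not
# drive automated decisions.
model = RandomForestClassifier(random_state=0).fit(X, y)

X_shuffled = X.copy()
X_shuffled[:, 5] = rng.permutation(X_shuffled[:, 5])  # break any link to the outcome

changed = np.mean(model.predict(X) != model.predict(X_shuffled))
print(f"Share of decisions that flip when column 5 is shuffled: {changed:.1%}")
```

If a material share of decisions flips, the supposedly irrelevant variable is in fact influencing the outputs and the design needs to be revisited.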

At the same time, care should be taken when accepting any post-hoc explanations of how an AI system has arrived at a particular decision: depending on how these explanations are elaborated, they may contribute to “fairwashing”, that is, providing plausibly fair explanations for the decisions of a black-box AI model that do not reflect the reality of how the model actually works. [3]
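
One simple safeguard is to measure how faithfully a post-hoc explanation mimics the black box it claims to explain. The sketch below is a minimal illustration under my own assumptions (the models, data and fidelity metric are illustrative, and this is not the method studied in [3]): fit an interpretable surrogate on the black box's own predictions and report their agreement on held-out data; low agreement means the readable explanation does not reflect how the model actually behaves.

```python
# Illustrative sketch only: checking the fidelity of a post-hoc surrogate
# explanation to the black-box model it is meant to explain.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

black_box = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# The surrogate is trained to imitate the black box, not the ground truth.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))

fidelity = accuracy_score(black_box.predict(X_test), surrogate.predict(X_test))
print(f"Surrogate fidelity to the black box on held-out data: {fidelity:.1%}")
```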

Practical Considerations for AI Developers

Optimising for one trustworthiness characteristic, such as interpretability, may be inherently at odds with maintaining others, such as accuracy.
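
A minimal, hedged illustration of that tension (synthetic data and model choices are my own assumptions, and results will vary from task to task): on the same dataset, a depth-limited decision tree is easy to read but will often score lower than a larger ensemble.

```python
# Illustrative sketch only: an interpretable model vs. a more complex one
# on the same task. Outcomes depend on the data and settings used.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

simple = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)
complex_model = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_train, y_train)

print(f"Interpretable tree accuracy: {simple.score(X_test, y_test):.3f}")
print(f"Random forest accuracy:      {complex_model.score(X_test, y_test):.3f}")
```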

In any case, the particular design choice has to be substantiated and documented, so that transparency and accountability are ensured for the choice being made.

Deliberating on this, as well as on many other tough questions, will likely require a collegial decision. For this, certain frameworks, such as the one proposed by ForHumanity, advocate forming a specialised body, such as an Algorithm Ethics Committee, which would enable taking in a more diverse set of inputs: not only from experts in data science, but also from those trained in other disciplines, such as ethics and law.

Furthermore, it will be key for a developer to communicate the design choices and resulting limitations appropriately downstream across the supply chain: to the system provider, the deployer and, where relevant, end users and other AI subjects.
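
One lightweight way to carry such information downstream is a structured, machine-readable record that travels with the model, in the spirit of a model card. The fields and values below are purely hypothetical examples, not a prescribed schema:

```python
# Illustrative sketch only: a minimal record of design choices and
# limitations to pass along the supply chain. All fields are hypothetical.
import json

model_record = {
    "model_name": "credit_scoring_tree_v1",  # hypothetical name
    "intended_use": "pre-screening of loan applications",
    "out_of_scope_uses": ["fully automated final credit decisions"],
    "design_choices": {
        "model_family": "shallow decision tree",
        "rationale": "interpretability prioritised over marginal accuracy gains",
    },
    "known_limitations": [
        "trained on historical data; performance may degrade under drift",
        "not evaluated for populations outside the training distribution",
    ],
    "explainability_support": "per-decision rule path available on request",
}

print(json.dumps(model_record, indent=2))
```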

***

Want to know which of these terms are used, and how, in the EU AI Act? Attend my webinar “Preparing for the EU AI Act”, which covers this and other key topics related to the new European AI regulation. Register today with the promo code NEWS to get 15 percent off.


References:

[1] NIST. 2023. “Artificial Intelligence Risk Management Framework (AI RMF 1.0).” NIST AI 100-1. https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf

[2] Barredo Arrieta, Alejandro et al. 2020. “Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI.” Information Fusion 58: 82–115. https://doi.org/10.1016/j.inffus.2019.12.012

[3] Aïvodji, Ulrich et al. 2019. “Fairwashing: The Risk of Rationalization.” Proceedings of the 36th International Conference on Machine Learning (ICML 2019): 240–52. https://arxiv.org/abs/1901.09749v3

