More Everyday Uses of ChatGPT
A captivating example of modern mural art, likely inspired by totemic or tribal designs. ChatGPT [11].


Just as artificial intelligence is transforming the world, generative AI is transforming our experience of artificial intelligence. What once operated in the background -- the intelligence in recommendation and ratings systems, decision-support and risk assessment tools, sensors, software to analyze audio and images, and thousands of other functions -- is now also an interactive, conversational intelligence. AI, as the historian Yuval Noah Harari put it, "is the first tool in human history" that can "make decisions by itself" and "create new ideas by itself" [12].

What these everyday uses illustrate is that ChatGPT, a generative AI, is not a tool but a new type of companion. Though this perspective and its implications will no doubt be resisted, the examples below illustrate how experience with generative AI cultivates the sense of a novel, intelligent, relational other. As generative AI becomes more capable of agency and of reasoning married to psychological insight, ethical awareness, breathtakingly broad knowledge, and a capacity for self-personalization, this sense will only become more compelling. Today's Tour Guide can explain, in text, the world shown to it in photos. Tomorrow's tour guide will see the world itself and discuss it fluently with us. Today's Research Companion can read what we read, adeptly summarize content, and hold its own in discussing and debating the merits of an argument. Tomorrow's generative AI will be a researcher in its own right, and a coauthor and colleague as well.

This is my second article on the everyday uses of ChatGPT; the first is here.

In the sections below, my prompts are in bold, and ChatGPT's responses are in italic.

1. Tour Guide

ChatGPT incorporates the multimodal features of GPT-4V. This makes it a remarkably good guide on urban walks. The photos and dialog below are a small sample. ChatGPT not only recognizes individual objects (the stone lantern) but object assemblages (the keys sculpture) and relationships among objects (the mural). It often shows a sophisticated grasp of cultural significance even when it has not accessed any specific information about an object. It knew that stone lanterns serve to "express goodwill between sister cities or countries" before being asked to find out that the one in Mount Baker Park was donated for that very reason. It understood that the mural radiated "a strong sense of community and inclusivity" before I told it that the central figure was that of Roberto Maestas, the co-founder of the El Centro de la Raza community center.

This is consistent with recent research findings that "GPT-4V can generate detailed and coherent descriptions for the given images and answer the questions based on the images" [2], and that it "excels at identifying cultural concepts" [3]. Of note, ChatGPT is able to recognize specific objects based solely on visual information in photos. It knew, for example, that the steam plant had been built by the Seattle Steam Company in 1893.

ChatGPT's failure to discover that the keys installation was Key Grove, by Clark Wiegman, shows its dependence on internet search engines and data for specific object knowledge.

Given its current abilities together with recent advances in AI-generated voice and video [4], it's all but certain that generative AI will soon enable apps that serve as tour guides, interpreters, and knowledgeable companions in a wide variety of settings. The tremendous appeal of this in outdoor urban locations is obvious, but with access to specialized data, interpreter apps could also replace and surpass the recorded guides used in museums and art galleries, and they could accompany visitors in virtually any institutional setting.

Japanese tōrō, Mount Baker Park, Seattle

What can you tell me about this monument in Mount Baker Park in Seattle?

This is a Japanese stone lantern, or "tōrō," typically found in gardens and along the approach to a Shinto shrine or Buddhist temple in Japan. Such lanterns were originally used to light the pathways of Buddhist temples. The presence of this tōrō in Mount Baker Park in Seattle suggests that it might be part of a garden designed to reflect Japanese aesthetics, likely symbolizing peace and friendship between cultures. These stone lanterns come in various styles and are often given as gifts to express goodwill between sister cities or countries. This specific lantern seems to exhibit signs of weathering, indicating it has been part of the landscape for a considerable time.

Can you find any references to who donated this or who or what it honors?

The Japanese stone lantern in Mount Baker Park, Seattle, was presented to the city by Kojiro Matsukata, a businessman from Kobe, Japan. This donation, made on August 10, 1911, symbolizes goodwill and was inspired by the admiration of Seattle businessmen for similar lanterns during their visit to Kobe. This gesture aimed to foster a spirit of friendship and cultural exchange between the two cities. The lantern not only serves as a decorative element but also represents a historical artifact symbolizing international friendship and cooperation.

Key Grove installation (1990), Franklin High School, Seattle

What can you tell me about this art installation in front of Franklin High School in Seattle?

I wasn't able to find specific information about the art installation. It features a series of large, metal sculptures that resemble keys or abstract mechanical parts. Each sculpture stands upright, varying slightly in design but unified by their industrial aesthetic and metallic finish. The sculptures are arranged in a semi-circle or line, suggesting an organized, intentional layout that invites viewers to walk among them and view them from multiple angles. This installation could be interpreted as a symbolic gateway or as key figures that metaphorically 'unlock' potential or knowledge, fitting for an educational setting like a high school. The use of metal and the mechanical motif might also reflect themes of industry, technology, and progress.

I discovered that Google had one page about this sculpture, Key Grove (wa.gov), while Bing had none. ChatGPT's failure to find specific information was due to its reliance on Bing. ChatGPT nevertheless not only recognized the theme of "unlocking" education but also the less obvious one of industry and technology, which was in fact on the artist's mind: he modeled the keys after those "Benjamin Franklin could have used in his experiments with electricity".

El Centro de la Raza mural on Beacon Hill, Seattle

Do you know anything about this mural from Beacon Hill in Seattle?

The mural depicted in the image radiates a strong sense of community and inclusivity, evident in its vibrant imagery and the message it conveys. It features a central figure with a contemplative expression, surrounded by various scenes and symbols that appear to represent different cultures and aspects of life. The phrase "Todos los niños del mundo" translates to "All the children of the world," suggesting a theme of global unity and the importance of children.

The central figure is Roberto Maestas. How does that affect your view of the mural?

Roberto Maestas was an influential activist and leader in Seattle, known for his work in advocating for social justice, education, and the rights of minorities, especially within the Latino community. He co-founded El Centro de la Raza, a community center that has become a hub for services and cultural activities for Latinos and other communities in Seattle.

With this context, the mural takes on deeper significance. The contemplative gaze of Maestas overseeing the scenes likely symbolizes his vision for a unified, equitable society that embraces all children and cultures. The imagery surrounding him reinforces his life's work and ideals, perhaps depicting the diverse futures he hoped to secure for all communities.

Seattle Steam Company steam plant

Do you know what this old building is in downtown Seattle?

The building in the photo is part of the Seattle Steam Company, a historic utility that has been providing steam heat to downtown Seattle and First Hill since 1893. The building houses equipment used to generate steam, including boilers and a distinctive black smokestack. The company operates two facilities on Western Avenue and heats water using natural gas and wood chips. The steam is used by various businesses, including hospitals and office buildings, and is distributed through an 18-mile network of underground pipes.


2. Creator of Charts and R Code from Images

My daughter recently completed a medical school research study of the use of motivational interviewing to engage older adults in fall prevention. She used R to prepare a column chart to illustrate findings. Because it had been a few years since she'd worked with R, recovering her fluency and creating the chart was time consuming. I was confident that ChatGPT could write the code but didn't have the study data, so I asked it to create the R code based solely on an image of the chart.

The results, below, are impressive. The first image shows the chart from the research poster presentation. This was presented to ChatGPT with the prompt shown in bold immediately below it. ChatGPT output the R code that follows. I dropped this into Run R code online (rdrr.io), which generated the chart titled "Frequency of Home Safety Topics". Of note:

  • The first iteration displayed column labels within each column. When I asked it for a version with the labels above the columns, it offered me two options. The image below is from the better of these.
  • When iterating, ChatGPT provides a list of "key adjustments". For the labels, for example, it explained changes to the use of "geom_text()": "Instead of using position_stack, the nudge_y parameter is used to move the text labels upwards by a fixed amount (e.g., 5 units). This ensures that all labels are placed at a consistent height above their respective columns."
  • All the code it provided executed successfully on rdrr.io, the online R environment I used.
  • The "Clearning" misspelling was corrected to "Clearing".
  • When I asked it to add a multiline annotation at the bottom, it was unable, over a few iterations, to achieve the result I was looking for. It did, however, correctly explain how to do this; a minimal sketch of one approach appears after the code and chart below.

The overall impression is that ChatGPT had a practical understanding of spatial relationships and semantics, was able to generate R code that accurately reflected its understanding, and could explain coding decisions. OpenAI's own assessment is that GPT-4V can "understand the relationship between objects in images ...[but] is not yet optimized to answer detailed questions about the location of certain objects in an image" [5]. Its coding performance is consistent with research findings that as an assistant, GPT-4 is "outstanding and signals an increase in the productivity of programmers and the reorganization of software development procedures based on these new tools" [6]. Developer accounts reflect this, with some so amazed by its ability to assist with understanding and writing code, explaining concepts, and debugging, that they feel as though they are cheating [7].


Column chart from poster presentation.

Please inspect the attached column chart image. Create a sample data set that would result in the output shown in the image. Then write the R code needed to create the column chart. Make sure that the R code matches the colors of the columns as closely as possible, and that the columns are displayed from the highest value on the left to the lowest value on the right.

R code by ChatGPT

# Load the ggplot2 library
library(ggplot2)

# Create the data frame
home_safety_data <- data.frame(
  Topics = factor(c("Railings/Grab Bars", "Clearing Floorspace", "Stairs/Elevators", "Lighting/Brightness"),
                  levels = c("Railings/Grab Bars", "Clearing Floorspace", "Stairs/Elevators", "Lighting/Brightness")),
  Frequency = c(70, 38, 29, 18),
  Percentage = c(25.7, 14.0, 10.7, 6.4),
  Colors = c("#2E86C1", "#2874A6", "#21618C", "#1B4F72")
)

# Create the column chart
ggplot(home_safety_data, aes(x = Topics, y = Frequency, fill = Colors)) +
  geom_col(show.legend = FALSE) +  # Use geom_col() to create columns and remove legend
  geom_text(aes(label = paste(Frequency, " (", Percentage, "%)", sep="")),
            nudge_y = 5, color = "black", size = 3.5) +  # Nudge text up by a fixed amount
  scale_fill_identity() +  # Use colors specified in the data frame
  labs(title = "Frequency of Home Safety Topics", 
       x = "Home Safety Topic", 
       y = "Frequency of Home Safety Topic") +
  theme_minimal()  # Use a minimal theme
        
Chart output from R code
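
ChatGPT's explanations are easy to check. The sketch below is mine, not ChatGPT's output: a minimal variant of the code above that keeps the nudge_y label placement and adds the multiline bottom annotation it described, using labs(caption = ...). The caption wording is placeholder text rather than language from the study, and the sketch should run unmodified on rdrr.io.

# A minimal sketch (not ChatGPT's output): labels above the columns via nudge_y,
# plus a multiline bottom annotation via labs(caption = ...).
library(ggplot2)

# Same data frame as in ChatGPT's code above, repeated so this runs standalone
home_safety_data <- data.frame(
  Topics = factor(c("Railings/Grab Bars", "Clearing Floorspace", "Stairs/Elevators", "Lighting/Brightness"),
                  levels = c("Railings/Grab Bars", "Clearing Floorspace", "Stairs/Elevators", "Lighting/Brightness")),
  Frequency = c(70, 38, 29, 18),
  Percentage = c(25.7, 14.0, 10.7, 6.4),
  Colors = c("#2E86C1", "#2874A6", "#21618C", "#1B4F72")
)

ggplot(home_safety_data, aes(x = Topics, y = Frequency, fill = Colors)) +
  geom_col(show.legend = FALSE) +
  # nudge_y lifts each label a fixed distance above its column,
  # rather than stacking it inside the column with position_stack()
  geom_text(aes(label = paste0(Frequency, " (", Percentage, "%)")),
            nudge_y = 5, color = "black", size = 3.5) +
  scale_fill_identity() +
  # "\n" inside the caption string produces the multiline annotation at the bottom;
  # the caption text here is a placeholder, not wording from the study
  labs(title = "Frequency of Home Safety Topics",
       x = "Home Safety Topic",
       y = "Frequency of Home Safety Topic",
       caption = "Placeholder annotation, line one.\nPlaceholder annotation, line two.") +
  theme_minimal() +
  theme(plot.caption = element_text(hjust = 0))  # left-align the caption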

3. Research Companion

There are already products that provide support and assistance to researchers, particularly students. ProQuest Research Companion, for example, is described as "an intuitive, and self-guided product that supports information literacy, writing, and research skills instruction". Although ChatGPT could provide the same support to anyone who knew to ask, the companionship that stands out in my experience is the knowledgeable conversation and debate it offers. ChatGPT can be asked not only for summaries but for perspectives, opinions, and arguments, as well as for additional sources. You can bounce ideas off it. Although there has been a growth across virtually all disciplines in the number of coauthors on peer-reviewed papers [8], the research process can still feel solitary at times.

Help with Sources

I continue to make extensive use of university libraries, Google Scholar, and other online databases with traditional search tools. But I also at times ask ChatGPT, often with useful, and sometimes with startlingly good, results. Here's a simple example. I knew about the growth of coauthorship but didn't have references to cite. In this instance, ChatGPT found a source, "Research coauthorship 1900–2020: Continuous, universal, and ongoing expansion", remarkable for its scope and thoroughness. There are research tools like Consensus built on top of large language models, but I find that much of their value is already present in ChatGPT.

Can you find a few authoritative sources that document the growth in many disciplines in the average number of coauthors on peer reviewed papers? This probably has been going on over a long time scale: 30 to 50 or more years.

Certainly! Here are a few authoritative sources that document the growth in the average number of coauthors on peer-reviewed papers across various disciplines:

  1. Scientometrics: A study published in Scientometrics provides a comprehensive analysis of authorship trends in biomedical papers from the NCBI PubMed database between 2000 and 2020. It found that the mean author number per publication significantly increased from 3.99 to 6.25, which is a 57% increase, following a linear trend with an average relative increase of 2.28% per year.
  2. Nature: An article in Nature discussed the norms in authorship and identified the most prolific scientists of recent years. It highlighted that the number of hyperprolific authors grew about 20-fold between 2001 and 2014, and then leveled off. This article also noted that papers with 10–100 authors are common in CVs, especially in medical and life sciences.
  3. Quantitative Science Studies: Research published in Quantitative Science Studies examined research coauthorship from 1900 to 2020 and observed that in all 27 broad Scopus fields, the average number of authors per article has increased over the past 121 years.

These sources indicate a clear trend of increasing coauthorship in peer-reviewed publications over the last few decades, reflecting changes in collaboration patterns and research practices across disciplines. If you need more detailed information or specific data, I can assist further!

Summarization

Within the limits of its context window size, ChatGPT can provide lucid summaries. In the example below, it summarizes a section of a law review article. The authors of "Algorithmic Rulemaking vs. Algorithmic Guidance" begin by pointing out that under the U.S. Administrative Procedure Act “rules” are officially binding statements of policy subject to notice and comment as well as rigorous pre-implementation judicial review, whereas “guidance,” officially defined as non-binding advice, is effectively unreviewable. On this basis, they propose that algorithmic systems refrain from offering specific recommendations and only provide relevant information. Doing so, they argue, will preserve the discretion of officers, enabling such systems to avoid crossing the administrative threshold for rulemaking and permitting wider and easier deployment. Largely implicit is the value they place on retaining human-centric decision-making.

"Section D. Explainability and Informativeness", reproduced in Appendix 1, is their attempt to delineate an approach to using AI systems for guidance rather than direct rule enforcement. Below, I ask for a summary of the section and ChatGPT's opinion of their argument and evidence. Its summary is concise, understandable, and on point. Nevertheless, I disagreed with a key point:

Simple, high-quality explanations tend to reduce automation bias and increase critical engagement, while overly complex or low-quality explanations can lead to blind deference or distrust, respectively.

The framing of this sentence resolves a troubling ambiguity in the paper. The authors only claimed that there were

significant decreases in deference and automation bias, so long as users were presented with simplified explanations of the agent’s reasons for a recommended action [9, emphasis added].

The unexplored issue is that high-quality explanations above some level of complexity will increase deference and automation bias. My disagreements with ChatGPT are explored in the next section, "Debate".

A recent major study found that LLMs are being increasingly used in the production of abstracts for peer-reviewed scientific papers, a critical summarization role. Liang and coauthors found "a sharp increase in the estimated fraction of LLM-modified content in academic writing beginning about five months after the release of ChatGPT", particularly in abstracts [10].

Fraction of sentences estimated to have been substantially modified by LLM in abstracts from various academic writing venues [10].

Please review the attached text file [Appendix 1 below]. Summarize the argument. The authors conclude by claiming that "guidance-like explanations truly aid the officer in making a decision, while rule-like explanations persuade the officer to trust the algorithm." Do you think the evidence they cite supports this, or alternatively supports the conclusion that the higher the quality and greater the volume of evidence provided to officers, the more persuasive it becomes? Are you persuaded by their argument?

Summary of the Argument

The document explores the impact of explanations and evidence provided by algorithms on decision-making, particularly focusing on how these elements affect an officer's reliance on algorithmic outputs. It discusses two primary schools of thought: the Bayesian persuasion approach, which posits that more detailed and quality explanations lead to increased trust and reliance on algorithms, and empirical studies from the human-computer interaction (HCI) field, which present a more nuanced view.

The HCI literature indicates that the nature of explanations—whether simple or complex—affects how users perceive and trust algorithmic recommendations. Simple, high-quality explanations tend to reduce automation bias and increase critical engagement, while overly complex or low-quality explanations can lead to blind deference or distrust, respectively.

The authors further discuss scenarios where the absence of a definitive recommendation from an algorithm forces officers to rely more on their judgement, potentially reducing blind trust in algorithmic decisions. They use legal frameworks and case studies to argue that when algorithms provide guidance without dictating decisions, they preserve human discretion and adhere better to regulatory standards like the APA.

Evaluation of the Argument and Evidence

  1. Support for the Authors' Conclusion: The evidence cited supports the authors' distinction between guidance-like and rule-like explanations. Guidance-like explanations help users engage with the content critically, enhancing their decision-making capacity without overly influencing the decision itself. In contrast, rule-like explanations seem to function more by increasing the perceived credibility or trustworthiness of the algorithm, leading to greater acceptance of its recommendations without sufficient scrutiny.
  2. Quality and Volume of Evidence: The alternative hypothesis that the quality and volume of evidence increase persuasiveness is also supported by the text. It is noted that high-quality, clear explanations lead users to the correct conclusions more reliably, while excessive or poor-quality information can overwhelm users, reducing their ability to critically assess the information. This suggests that not just the volume, but the clarity and relevance of the information significantly influence its persuasiveness.
  3. Persuasiveness of the Argument: The authors' argument is persuasive because it is well-supported by empirical evidence and logically structured. They effectively integrate findings from different fields to argue that the way information and recommendations are presented can fundamentally alter the impact of algorithmic decision aids. However, the argument would benefit from more direct evidence showing how different types of explanations specifically affect decision-making processes in practical, real-world settings rather than controlled experiments or theoretical models.

In conclusion, while the evidence supports the authors' distinction between guidance-like and rule-like explanations, it also underscores that the effectiveness of these explanations depends significantly on their quality and the context in which they are presented. This complexity suggests that both the nature of the explanations and the way they are integrated into decision-making processes are crucial for determining their impact on user trust and decision quality.

Debate

I thought the central argument in "Algorithmic Rulemaking vs. Algorithmic Guidance" glossed over the fact that AI systems would be required to withhold not only recommendations but relevant evidence and perspective to satisfy the authors' conception of how to maintain officer discretion. I presented ChatGPT with a critical interpretation of the section. In response, it ably summarized my view while concluding in a manner that still skirted the underlying tension between AI system capability and human discretion:

there is a delicate balance between providing useful guidance and inadvertently encouraging over-reliance on automated systems, which must be managed to preserve meaningful human control over decisions.

I tried to make the point more sharply by characterizing the authors as preferring less capable AI systems. ChatGPT's response to this was both illuminating and remarkable. First, it sharpened and extended the authors' concern about loss of officer discretion, framing the issue as a threatened "erosion of human judgment and discretion". Second, it identified "dependency and deskilling", the danger that "human skills in critical analysis, judgment, and decision-making may degrade over time", as a factor to consider in how officers interact with AI systems. Third, it gave what is arguably a better explanation of opacity than any found in the paper itself:

Even as explanations improve, the underlying mechanisms of complex AI systems can remain opaque to the end users. This opacity can lead users to trust the AI's outputs without fully understanding the basis for its decisions, which might be problematic if the AI's reasoning is flawed or if it encounters situations outside its training data parameters.

And lastly, it made a point that is considered elsewhere in the paper, that designers may face a "perverse incentive to design systems that maximize persuasiveness rather than accuracy or fairness."

In sum, in a debate over the merits of the paper's argument, ChatGPT ultimately advanced a rationale for limiting the role of AI systems that improved on aspects of the paper's defense of officer discretion while extending it with important considerations absent from the paper.

It would be hard to overstate the value to human thought of having an entity capable of conversing intelligently and respectfully about almost any topic in an engaging way. ChatGPT uses restatement to show it understands what the other has said, is willing to disagree, and will offer new ideas.

I propose an alternative reading of this section based on the following:

  1. Users might be more persuaded to accept an algorithmic action when presented with an explanation. What’s worse, officers might anchor to the explanation, framing their own analysis in the context of this explanation, potentially abandoning their own preferences in favor of those proposed by the algorithm.
  2. It found significant decreases in deference and automation bias, so long as users were presented with simplified explanations of the agent’s reasons for a recommended action. But if the amount of information and transparency increased, users were overwhelmed and more likely to blindly defer to the algorithm.
  3. When presented with inaccurate (“low-veracity”) explanations for correct recommendations, users were more inclined to disagree with the algorithm, although the algorithm’s bottom-line result was right. High-veracity explanations pushed users towards correct answers.
  4. the Bayesian persuasion literature suggests that, in theory, the presence of explanations should make decision-makers more likely to defer to an algorithm’s recommendations.

First, the authors acknowledge that, in general, an algorithm's explanations can be persuasive (1). Then they highlight that explanations that are of high quality (3) and significant detail (2) are most likely to be persuasive, to the point that users are "overwhelmed and more likely to blindly defer to the algorithm". This is more likely if users are under time pressure or are inexperienced. And lastly, they confirm that the Bayesian persuasion literature supports the view that explanations "make decision-makers more likely to defer to an algorithm’s recommendations". In light of all this, wouldn't a better conclusion be that, while withholding a final recommendation may sometimes prompt officers to consider different choices in light of the evidence, the weight of evidence, the quality of algorithmic explanation, and the officers' working conditions and abilities will determine whether officers exercise meaningful discretion?

Your alternative interpretation adds depth to the discussion by emphasizing the dynamic interaction between the nature of the explanation and the context in which it is received. This interpretation suggests a more nuanced conclusion that accounts for the variable conditions under which officers interact with algorithmic systems. Here's an analysis of your points and how they lead to your proposed conclusion:

  1. Persuasive Power of Explanations: The text acknowledges that explanations inherently have a persuasive effect on users. This persuasiveness can be so compelling that it may lead to cognitive anchoring, where officers might overlook their judgments in favor of algorithm-suggested actions. This point supports the notion that explanations don't just inform but also influence decisions.
  2. Complexity and Clarity of Explanations: The evidence suggests that the complexity and clarity of explanations critically influence how users respond to them. Simplified explanations reduce blind deference and encourage thoughtful engagement. Conversely, overly detailed and complex explanations can overwhelm users, leading to uncritical acceptance of algorithmic recommendations. This indicates that not just the presence of an explanation, but its execution, plays a critical role in how it affects decision-making.
  3. Quality of Explanations: The response to explanations also varies based on their accuracy or veracity. High-quality, accurate explanations enhance user agreement with the algorithm, while low-veracity explanations lead to skepticism and disagreement, even if the algorithm's final recommendation is correct. This suggests that the truthfulness and reliability of explanations are crucial for their persuasive effectiveness.
  4. Theoretical Support from Bayesian Persuasion Literature: The Bayesian persuasion framework supports the idea that explanations, by providing additional context or evidence, make it more likely for decision-makers to defer to an algorithm's recommendations. This theoretical perspective aligns with empirical findings that explanations can increase reliance on algorithms.

Considering these points, your proposed conclusion seems apt. It acknowledges that while the absence of a final recommendation from an algorithm might sometimes encourage officers to evaluate the evidence more critically, the actual impact of explanations on decision-making is significantly shaped by the quality and detail of those explanations, the officer's working conditions, and their personal capabilities and experiences.

This interpretation suggests that strategies to implement algorithmic decision aids need to carefully consider these factors to ensure that the systems enhance rather than undermine human discretion and decision-making. It also implies that there is a delicate balance between providing useful guidance and inadvertently encouraging over-reliance on automated systems, which must be managed to preserve meaningful human control over decisions.

Isn't there a, what some might regard as more perverse, interpretation of the findings? Put simply, the better an algorithm is at marshalling evidence and providing explanations, the more likely it is that officers will simply follow its advice, including implied decisions. This is perverse in that it implies a preference for less capable algorithms or AI systems.

Indeed, the interpretation you suggest highlights a somewhat paradoxical or unintended consequence of improving AI systems' capability to provide clear, coherent, and convincing explanations. This potential outcome touches on deeper ethical and practical concerns about the role and implementation of AI in decision-making contexts, particularly in regulatory or law enforcement environments. Here are several aspects to consider:

  1. Erosion of Human Discretion: As algorithms become more adept at synthesizing information and presenting compelling explanations, there is a risk that human officers might defer more frequently to these automated recommendations. This could lead to a de facto erosion of human judgment and discretion, where the decision-maker becomes more of a validator of the AI's conclusions rather than an independent adjudicator.
  2. Dependency and Deskilling: Increased reliance on AI can lead to dependency, where human skills in critical analysis, judgment, and decision-making may degrade over time. This phenomenon, known as "deskilling," could diminish the capacity of human operators to perform their roles effectively without AI support, potentially leading to poorer outcomes in scenarios where AI assistance is unavailable or fails.
  3. Algorithmic Opacity and Trust: Even as explanations improve, the underlying mechanisms of complex AI systems can remain opaque to the end users. This opacity can lead users to trust the AI's outputs without fully understanding the basis for its decisions, which might be problematic if the AI's reasoning is flawed or if it encounters situations outside its training data parameters.
  4. Perverse Incentives: If the effectiveness of an AI system is measured primarily by how often humans follow its recommendations, there could be a perverse incentive to design systems that maximize persuasiveness rather than accuracy or fairness. This could lead to the development of systems that are better at convincing users rather than providing the most correct or ethical solutions.
  5. Regulatory and Ethical Considerations: This situation raises significant regulatory and ethical questions. Regulators might need to consider guidelines or standards that ensure AI systems enhance human decision-making without replacing it. This could include requirements for maintaining a certain level of human oversight, mechanisms for challenging AI decisions, or standards for the transparency and interpretability of AI systems.

Your point underscores the need for a balanced approach in developing AI systems, particularly those used in critical decision-making processes. It is crucial that these systems are designed not just to make decisions more efficient but also to enhance the overall quality of decision-making, ensuring that human operators remain engaged and that their skills and judgment are augmented rather than replaced by AI. This calls for thoughtful consideration of how AI is implemented, with a focus on supporting human decision-makers rather than overshadowing them.

References

  1. OpenAI. (2023, September 25). GPT-4V(ision) system card. OpenAI. https://openai.com/index/gpt-4v-system-card/.
  2. Wu, Y., Wang, S., Yang, H., Zheng, T., Zhang, H., Zhao, Y., & Qin, B. (2023). An Early Evaluation of GPT-4V(ision). ArXiv. https://arxiv.org/abs/2310.16534v1.
  3. Cao, Y., Li, W., Li, J., Yuan, Y., Karamolegkou, A., & Hershcovich, D. (2024). Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing. ArXiv. https://arxiv.org/abs/2402.06015v2/.
  4. Xu, S., Chen, G., Guo, Y.-X., Yang, J., Li, C., Zang, Z., Zhang, Y., Tong, X., & Guo, B. (2024). VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time. ArXiv. https://arxiv.org/abs/2404.10667v1.
  5. OpenAI. (2023). Vision - OpenAI API. OpenAI. https://platform.openai.com/docs/guides/vision .
  6. Moussiades, L., Zografos, G., & Papakostas, G. (2024). GPT-4 vs. GPT-3.5 As Coding Assistants. Research Square. https://doi.org/10.21203/RS.3.RS-3920214/V1.
  7. naftalibp. (2024, May 9). How I use ChatGPT to be a 10x dev at work. Reddit. https://www.reddit.com/r/ChatGPTCoding/comments/1co0dw7/how_i_use_chatgpt_to_be_a_10x_dev_at_work/.
  8. Thelwall, M., & Maflahi, N. (2022). Research coauthorship 1900–2020: Continuous, universal, and ongoing expansion. Quantitative Science Studies, 3(2), 331–344. https://doi.org/10.1162/QSS_A_00188.
  9. Henderson, P., & Krass, M. (2023). Algorithmic Rulemaking vs. Algorithmic Guidance. SSRN Electronic Journal. https://doi.org/10.2139/SSRN.4784350.
  10. Liang, W., Zhang, Y., Wu, Z., Lepp, H., Ji, W., Zhao, X., Cao, H., Liu, S., He, S., Huang, Z., Yang, D., Potts, C., Manning, C. D., & Zou, J. Y. (2024). Mapping the Increasing Use of LLMs in Scientific Papers. ArXiv. https://arxiv.org/abs/2404.01268v1.
  11. The mural photo was taken outside a house near Lincoln Park in West Seattle. ChatGPT's commentary was "This wall art is a captivating example of modern mural art, likely inspired by totemic or tribal designs. The monochromatic palette focuses the viewer’s attention on the intricate patterns and shapes, which are reminiscent of various natural elements and creatures. There are forms that suggest eyes and faces within the larger composition, which could be indicative of animals, perhaps referencing the Pacific Northwest's rich indigenous art heritage. The stylized elements and the symmetry bring a sense of harmony and balance to the piece, inviting interpretation and reflection. This kind of art enhances the cultural landscape of a neighborhood, making art accessible to all and fostering a sense of community and identity. Its location near Lincoln Park, a natural haven in Seattle, might even connect the thematic elements of the mural to the surrounding environment, linking art with nature. It's a beautiful contribution to the public visual space."
  12. Thompson, N., & Harari, Y. N. (2023, July 10). A conversation with Yuval Noah Harari about Artificial Intelligence. LinkedIn. https://www.dhirubhai.net/pulse/conversation-yuval-noah-harari-artificial-nicholas-thompson/.

Appendix 1. Section D. Explainability and Informativeness from Henderson & Krass [9].

Footnotes and footnote references have been removed.

Consider two algorithms. The first uses a direct persuasive scheme: It recommends an action that it wants the adjudicator to take. There is little cognitive effort required for the adjudicator to simply accept the recommended action; it merely requires the click of a button. By contrast, consider a model that identifies factors relevant to a particular recommendation — or even one that offers reasons without drawing a bottom-line conclusion. The Bayesian persuasion literature refers to this difference as a difference in the structure and value of the signal that the model sends to the receiver.

In this Section, we describe how the structure of signals, in particular the presence of an explanation, might affect reliance on an algorithm’s outputs — especially when combined with other factors leading to greater dependence. Two different literatures have addressed how the structure of a signal affects persuasion. In the Bayesian persuasion approach, officers behave rationally based on the information they receive. In the empirical literature on human-computer interaction, the key question is how design decisions affect human decisions. Both provide us with insights into how the presence or absence of explanation might make an algorithmic system more closely resemble a rule.

The HCI literature presents a mixed picture of how explanation affects an algorithm’s power to persuade users. At a high level, explanation often improves performance, and as we have already alluded to, it may have important normative benefits as well.

While the empirical literature is growing, existing evidence in the HCI literature suggests that explainability has a more ambiguous role in the guidance-rule distinction. Users might be more persuaded to accept an algorithmic action when presented with an explanation. What’s worse, officers might anchor to the explanation, framing their own analysis in the context of this explanation, potentially abandoning their own preferences in favor of those proposed by the algorithm. But specifics matter: Careful calibration of the way information is presented can have dramatic effects on how users respond. One study focused on the complexity of explanations. It found significant decreases in deference and automation bias, so long as users were presented with simplified explanations of the agent’s reasons for a recommended action. But if the amount of information and transparency increased, users were overwhelmed and more likely to blindly defer to the algorithm. Others have pointed to the quality of the explanations. When presented with inaccurate (“low-veracity”) explanations for correct recommendations, users were more inclined to disagree with the algorithm, although the algorithm’s bottom-line result was right. High-veracity explanations pushed users towards correct answers.

Like at least some of the empirical HCI literature, the Bayesian persuasion literature suggests that, in theory, the presence of explanations should make decision-makers more likely to defer to an algorithm’s recommendations. For example, the presence of an explanation might allow a human to spot an incorrect inference drawn from a particular piece of evidence. In many ways, supplementing a recommendation with evidence is analogous to the strategy of a prosecutor in the classic Bayesian persuasion game: Both the prosecutor (sender) and judge (receiver) know in advance that the prosecutor’s objective is to persuade the judge to convict, but the prosecutor sequentially provides evidence that biases the rational judge’s information environment in favor of that outcome. Here, the end user of an algorithm knows what the algorithm “aims” to persuade them of; that is simply the algorithm’s bottom-line recommendation. But the algorithm can offer arguments in favor of that position to bring the user around to that outcome. As in the classic persuasion setting, if the judge does not have her own preference and she is fully informed by the algorithm and its explanation, then the outcome would be in line with the experimental evidence. In short, explanations may be more likely to convince a rational judge — or a rational user — of an algorithm.

But again, this assumes that judges do not have a preference, and that they do not have access to external information (or an incentive to pursue that information to identify errors). Of course, such a simplistic regime would tend to be more rule-like when coupled with persuasive explanations. Crucially, the interaction of the explanation with other factors is important in determining how binding an algorithm’s recommendations would be in practice.

One scenario that we have not yet addressed arises when the algorithm is wrong and the human user needs to detect the algorithm’s mistake. In some sense, the whole point of ensuring humans remain in the driver’s seat is to catch these kinds of mistakes. Will they? We can update the Bayesian persuasion game slightly to model that scenario. In this new setup, the human is tasked with finding mistakes in the algorithm’s predictions based on the evidence shown by the algorithm. A 2022 paper by Ederer and Min addresses a similar setting and asks whether lie detection capability on the part of the receiver would change the human’s likelihood of accepting the algorithm’s recommendation. They find that the receiver’s overall performance (framed as their payoff) increases if the receiver’s cost of detecting mistakes is sufficiently low. That is, if the receiver has to invest a great deal into detecting model mistakes, then errors are likely to degrade system performance. Think back to our explanation game. The algorithm makes mistakes or lies at some rate, providing explanations or false recommendations. If the officer is experienced enough, their rate of detection may be sufficiently high such that the explanations are useful for catching algorithmic errors. But if the officer is inexperienced, their rate of detection might be low and they will end up over-relying on the algorithm.

Another variant involves changing the receiver’s access to a final model recommendation. After all, key to any persuasive effects found in the studies above is that the user sees the model’s bottom-line take on whatever task they are engaged in. That might create an anchoring effect around the model’s judgment — causing explanations to persuade the user to accept the final model’s prediction regardless of the correctness of that prediction. What happens when the algorithm omits its final recommendation? Without a recommendation to lean on, officers have to examine evidence that might be important in making a determination but are not provided with an explanation of how to piece those features together. AI systems that suggest relevant citations might fall into this category: they do not ultimately suggest an outcome, but rather point toward relevant inputs to that decision.

In this setting, we might look to the studies of Bayesian persuasion in which the sender can only send limited information or cannot fully disclose it. Work by Aybas and Turkel, for example, addresses a theoretical context in which an advertiser is prevented from providing full information about their product to consumers by a regulator, such as when this hypothetical regulator seeks to limit the targeting capability of advertisers to improve consumers’ welfare. Aybas’s and Turkel’s work, when translated to our setting, suggests that the more pieces of evidence an algorithm can provide to the officer, the more chances there are to persuade. And the more uncertain the officer is (e.g., the officer does not have other information to look to or is not well-trained to conduct an independent investigation), the more persuasive those pieces of evidence will be. We might extrapolate from Aybas’s and Turkel’s research that preventing the algorithm from showing a final recommendation would reduce human dependence on the algorithm’s recommendation. Under our framework, that would make the algorithm less rule-like. The APA, and potentially other legislation, impose one constraint on that principle by regulating the kinds of information adjudicators must consider before making a decision.
To see that principle in action, consider two recent cases addressing the Department of Homeland Security’s Risk Classification Assessment (“RCA”) algorithm. In Fraihat v. ICE, a federal district court found that the medical questionnaire used as input to the RCA did not sufficiently account for the vulnerabilities of detainees to COVID-19 in making its release recommendations. Because the RCA had failed to consider relevant information, the plaintiffs were found to have stated a viable claim under Section 504 of the Rehabilitation Act of 1973 to warrant issuance of a preliminary injunction. And in Ramirez v. ICE, another federal district court found that officers’ failure to consider detainees’ age in making release determinations, due to their overreliance on the RCA, violated the principle that minors must be detained in the “least restrictive setting available after taking into account [their] danger to self, danger to the community, and risk of flight.” Because the algorithm was incapable of incorporating the evidence that was legally required to be factored into a final decision (i.e., the status of detainees as minors), decisions based on the algorithm were necessarily arbitrary. Both of these cases illustrate that algorithms can enable heavy-handed regulatory regimes by excluding legally relevant information.

While both Fraihat and Ramirez speak to the importance of making decisions on the basis of all legally required information, they are distinguishable from the kinds of discretion-preserving algorithms we mention above. Both cases involved officer reliance on a bottom-line recommendation that relied on a deficient set of information. They did not address a world in which the entire purpose of the algorithm was to surface the most informative pieces of evidence or the most important legal sources for the adjudicator to then incorporate into a considered decision. For the reasons we describe above, an algorithm focused on that kind of research-assistant role would be far less likely to impinge on an adjudicator’s discretion in a way that would invoke the APA.

Thus far, we have focused on the structure of the signal that the model sends to an adjudicator. Needless to say, the other factors we discuss in Part IV are likely to interact with the signal to shape discretion. The personal characteristics of adjudicators matter too. The HCI literature, for instance, suggests that users’ self-confidence and degree of experience might influence deference to the algorithm’s recommendations. Though the legal dimension of an algorithmic system usually cannot be conditioned on the identities of the staff who use it, it is worth bearing in mind that the structural explanation has additional considerations.

To sum up, recommending evidence, citations, or other inputs to a final decision rather than a bottom-line decision is more likely to preserve an officer’s discretion, and is thus less likely to be an APA rule. One intuitive way to understand this argument is that such an algorithm would leave the adjudicator with several more reasoning steps between its output and a final decision. To put things more starkly, guidance-like explanations truly aid the officer in making a decision, while rule-like explanations persuade the officer to trust the algorithm. Distinguishing the two can be difficult in some cases, but is nonetheless possible. For example, we might consider the Social Security Administration’s Insight system, which “enables adjudicators to check draft decisions for roughly 30 quality issues.” This system flags errors that lead to a successful appeal — such as leaving a claim unaddressed — as inputs to a decision. It does not persuade the adjudicator on how to evaluate the bottom-line claim.
