How ChatGPT/LLM Will Change Human-Computer Interaction (HCI) Design Forever.
Charles L Mauro CHFP
Founder and President of Mauro Usability Science / Neuroscience-based Design Research / IP Expert
Every Year At This Time, our research team creates a detailed article on the worst human factors and usability solutions of the past 12 months. This is a popular post with many views. Our preliminary worst-of-2022 list included the Slack UI update, Meta's Metaverse, Tesla's autonomous driving system, and the Google AdWords Interface.
Hmmm...What Is This? While we were producing our analysis, ChatGPT showed up. After extensive poking and prodding, we concluded that ChatGPT and other similar large language models (LLMs) were breakthrough applications that could transform how we work with all manner of words. However, that finding was not our core insight...there is more to LLMs than the first impression.
The most significant impact of LLM technology will be on the UX/HCI design for a particular type of software interface known in human-computer interaction research as "Complex System-Operation Interfaces" (CSOI). Our world economy runs on CSOIs. It turns out that Google AdWords is such an interface. Here is our supporting data using AdWords as an example.
A New Front End and Back End: These new LLMs will fundamentally change how user interfaces are designed for complex software applications in the future. Combined with code generation tools like Copilot and with HCI design, a new era of dialogue-based human-computer interaction (HCI) is upon us. The change will have a massive impact on HCI/UX design as a discipline. Surprisingly, the Google counterpart to ChatGPT, known as BARD, will be more critical as a new interaction framework for Google AdWords than as a text-based content creation tool. Yes, you read that right!
How LLMs Solve A Fundamental UX/HCI Interface Design Problem: Working from the vantage point of more than 40 years of conducting professional human factors research and creating human-computer interaction solutions for truly complex and high-risk systems, I have come to understand that user interface design problems fall under three categories:
Category 1: Simple Task-Based Interfaces (STBI): These are simple software interfaces like Instagram, Facebook, Twitter...etc. Higher usability levels are achieved through simple task flows that are highly repetitive and follow a given platform's SDK rules. A typical example is the limited functionality in Apps for iOS and Android. Generally, these interfaces have low interaction risk but a very high frequency of use. They are simple to design but difficult to improve. These interfaces are successful based on massive levels of repeated use, which also explains their often addictive nature...that is another story.
Category 2: Search and Select Interfaces (SSI): These are more complex than Category 1 because they combine directed search with content analysis and often purchase decision management. A typical example is Amazon.com or another e-com-directed site that requires multiple task flows and decision vectors to achieve reasonable levels of success for the user. Achieving an acceptable level of usability in these systems involves matching mental models and paying careful attention to navigational structure and feedback. They also benefit from extensive repeated use by customers.
Category 3: Complex System-Operation Interfaces (CSOI): This final category of user interface design makes possible interaction with complex technological systems and products that contain a wide range of input variables that must be manipulated and managed by the user in real-time, which then receive system status feedback in the form of complex data visualization displays. This type of user interface design solution routinely requires high levels of user training to achieve business success and safe operation. Examples include air traffic control, stock trading systems, process control, and, somewhat surprisingly, an increasing number of business applications, most specifically Google AdWords (GA), which, as we shall see, has become a massive usability problem for Google. This category of user interface design can be revolutionized by using LLM systems linked to underlying code that drives system operation based on the utilization of a complex task-based training set that overlays a text-dialogue interface with underlying code executing tasks based on text input from the user.
Profound Improvements in Category 3 (CSOI) HCI Design: We realized that under the ChatGPT framework lurks a massive change in the UX/HCI Design for modern complex software applications. A change as profound as Douglas Engelbart's discovery of the GUI and the Desktop Metaphor. Here is what we mean by way of one of the examples we had selected for our worst human factors solutions of 2022: The Google AdWords (GA) user interface.
GA Usability Problem Is Not New Ground: For over a decade, Google AdWords has been on a journey into usability hell. We noted this problem in a 2016 article. To be clear, Google has yet to have a real incentive to fix the massive usability problem because it has a near-perfect monopoly in online advertising. This high level of complexity likely works in Google's favor because, in many instances, users of GA are stuck having to utilize GA even though they have, at best, sketchy control of how their funds are being allocated and the performance of their ads. This factor should be considered in the current U.S. government Antitrust case against Google. This may be a critically important variable for demonstrating that Google is a rampaging monopoly with no incentive to optimize customer costs or improve the core usability of GA. However, truly challenging usability also works against Google in less apparent ways.
On the profit side of the equation, this turns out to be a non-trivial problem for GA because massive usability problems come with the well-understood loss of business in its primary business channel, online advertising. The persistent GA usability problem impacts businesses worldwide because they still need to take advantage of the original GA functionality that was a game-changer early on and an exemplar of simplicity. What do we mean by this?
Sink or Swim: Today, unless a business employs a team of AdWords experts or expends significant funds on an AdWords agency, it is effectively dead in the water. Asking for help from the Google Ad support team is a total lost cause as Google's response time for help is now measured in months. Google is aware of this problem as it has tossed into the mix an army of "customer support representatives" (CSR) designed to walk customers through certain minefields in the GA application. Even in training sessions, the GA CSR cannot usually figure out how to optimize a GA account except by increasing the spending limit on one's credit card. This amounts to Google simply having more funds with even less user control. Based on our experience, employee turnover on the GA support team is measured in days. It's too bad Google did not spend the billions it tossed into the cash incinerator on hardware design and instead fixed AdWords. On the other hand, thermostats and smartphones are a lot more fun but vastly less profitable.
GA Profits: This does not bode well for the "ATM Stuck On Withdrawal" known as Google AdWords. So bad is the current usability that our team has effectively stopped nearly all use of the system, probably to our detriment, not to mention lost revenue to Google. Today, there is no way to easily set up, test, optimize, and productively manage a typical small or even medium-sized business AdWords account within an acceptable level of complexity. Even with epic usability problems, millions of AdWords customers still hand over their credit cards and hope for the best. Almost certainly, the problem is impacting GA's profitability and market penetration as it loses ad revenue to new entrants. How did this happen, and what does it have to do with ChatGPT, Douglas Engelbart, and the demise of the traditional Graphical User Interface (GUI)?
Mental Models are the Heart of UX/HCI: At the heart of the GA usability problem is a human-computer interaction (HCI) design concept known as mental model matching (M3). This formal research methodology is known in human factors science as the "transfer of learning." When one sets out to design a UI for a complex application, the new system must match the existing mental model of users interacting with the new system OR the system design must include the massive user training required to construct a workable mental model in the mind of the user. UX designers and software development managers rarely consider this critical research. Instead, they push ahead, creating page-based user interaction flows based on their own intuition of what will be easy to use. This approach is widespread and devastating in terms of usability. Here is why.
Douglas Engelbart's Masterpiece: Engelbart employed the user's prior learning from manipulating objects in the real world and applied them to control of objects in a screen-based system. The idea was the transfer of learning for manipulating objects by grasping, moving, and dropping things to achieve a task...all manner of tasks. Engelbart transferred this highly routinized and practiced behavior into computer screen behavior through a simple mouse click and hand movement. So powerful was this concept that users the world over got it in a few minutes. The rest is history, as Xerox copied and refined Engelbart's concept. Apple then copied Xerox, and Microsoft captured the GUI from Apple. By the mid-1990s, the GUI was the primary interaction model for all consumer-facing computers and complex process control systems. Billions of users exercised the Engelbart GUI design. But then everything changed.
HTML and Scrolling: Just as the Engelbart GUI came to dominate human-computer interaction theory and design, the internet emerged with an entirely new set of user interaction problems. These legendary problems resulted from the introduction of a new interaction framework based on "Pages"...billions and billions of pages connected by links and scrolling. Given the limitations of web software, a robust GUI in the vein of Engelbart was no longer possible or desirable.
Pages and More Pages: In a complex application like GA, approximately a million possible task flows are connected through thousands of individual "Pages." Each "Page" has complex selection elements, graphics, data input fields, popup overlays, and real-time updates. In the best case, these elements flow together based on user input to create an interface to the underlying functionality of the GA system. Generally speaking, a rough form of main high-level navigation attempts to stitch the core GA functionality together for the user to navigate between significant functions at higher content levels. I note that this high-level structural navigation is missing entirely in GA.
Not An Expert? Too Bad: This leaves anyone who is not a GA expert on a random walk across the interface, hoping to find a solution to a given problem or feature. In addition to the pages that capture data input by the user, there are hundreds of information display pages, alerts, help content, and account status screens. If a company utilizes the advanced features of GA, the number of screens balloons into the thousands.
It is hard to feel sorry for Google at this point since it likely has the largest number of UX designers in tech and hires hundreds of newly minted PhDs for positions in user research, even though the vast majority of such hires have virtually no experience in human factors research.
The Trade-Off: Creating a highly usable user experience for the AdWords front end requires extensive levels of cognitive modeling, task analysis, root cause error analysis, user variability analysis, and massive levels of independent user testing during all phases of development. The primary metric for AdWords in the context of business improvement is the relationship between the user's cognitive workload vs. AdWords delivery of validated leads and new business to its customers. Think of this as a trade-off between mental effort vs. business benefit. Currently, this model needs to be more balanced regarding cognitive complexity. So what does this have to do with ChatGPT or similar AI platforms?
Given the total number of use cases that AdWords customers need access to and the number of interactions with the Google AdWords backend, it makes perfect sense to NOW produce for Google AdWords a dialogue LLM UI that almost any small business can utilize to set up, configure, test, validate, optimize, and upgrade their Google AdWords account. Here is an example.
Example Task: Run a New Ad on AdWords. Here is a task the user of GA would like to achieve. This is a task from our firm's use of GA.
Advertising Campaign Objective Statement: "Create and run an ad for 'Usability Testing Services For Medical Devices.' The ad should run only Monday through Friday between 9 am and 6 pm Eastern Time, with a total cost per week of $500 billed to our corporate Visa card. Send an alert when 75% of the weekly fee is expended and allow for an automatic increase in ad spending budget during the week based on receipt of the alert. Provide a daily summary of ad performance based on standard KPIs. Deliver the report to our CEO, CFO, and CMO by 12 pm Eastern Time every day GA is running. Please don't run ads on weekends or holidays. After three months, optimize the ad campaign based on performance objectives. Set the performance objectives as 'Contact Page Registrations' per our corporate site."
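To make the objective statement concrete, here is a minimal sketch of how such a prompt might look as structured data once a dialogue system has parsed it. This is an illustration under stated assumptions, not Google's API: the CampaignSpec class, its field names, and its helper methods are all hypothetical. Only the values (the $500 weekly budget, the 75% alert, and the Monday-to-Friday, 9 am to 6 pm Eastern schedule) come from the statement itself. Holidays and time-zone conversion are left out to keep the sketch short.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class CampaignSpec:
    """Hypothetical structured form of the objective statement (not a Google type)."""
    headline: str
    weekly_budget_usd: float
    alert_fraction: float          # share of the weekly budget that triggers an alert
    start_hour: int                # first hour ads may run (inclusive, Eastern Time)
    end_hour: int                  # hour ads stop running (exclusive, Eastern Time)
    run_weekdays: Tuple[int, ...]  # Monday=0 ... Sunday=6

    def alert_threshold_usd(self) -> float:
        """Dollar amount at which the spending alert should fire."""
        return self.weekly_budget_usd * self.alert_fraction

    def is_scheduled(self, eastern_time: datetime) -> bool:
        """True if ads may run at this moment (assumed to be in Eastern Time already).

        Holidays are deliberately not modeled in this sketch.
        """
        on_allowed_day = eastern_time.weekday() in self.run_weekdays
        in_allowed_hours = self.start_hour <= eastern_time.hour < self.end_hour
        return on_allowed_day and in_allowed_hours


# Values taken from the objective statement; the structure is an assumption.
SPEC = CampaignSpec(
    headline="Usability Testing Services For Medical Devices",
    weekly_budget_usd=500.0,
    alert_fraction=0.75,
    start_hour=9,
    end_hour=18,
    run_weekdays=(0, 1, 2, 3, 4),
)

print(SPEC.alert_threshold_usd())                      # 375.0
print(SPEC.is_scheduled(datetime(2023, 5, 1, 10, 0)))  # Monday, 10 am -> True
print(SPEC.is_scheduled(datetime(2023, 5, 6, 10, 0)))  # Saturday, 10 am -> False
```

The point of the sketch is that one paragraph of plain language resolves into a handful of fields: the alert fires at $375 of the $500 weekly budget, and the schedule check is a two-line rule.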
Massive Complexity: Today, setting up this same task on the current Google AdWords interface will take several days of effort, hundreds of forward and backward navigation events, at least a dozen calls to customer service with no response, and finally, giving up or settling for an ad campaign over which you have minimal understanding and control. If this seems slightly unrealistic, try it yourself, and assume you will need to become an AdWords expert along the way.
Programmed Simplicity: Where does LLM ChatGPT fit in the Google AdWords usability problem space? The answer is to utilize a ChatGPT/BARD-style LLM system to dramatically reduce the cognitive complexity of the interaction between the Google AdWords system and the potential client. Instead of creating hundreds of screens and underlying data capture infrastructure, Google can catalog and map user interactions using LLM training sessions into a use-case database that executes against a table of functions in AdWords based on a vast range of user dialogue prompts. By creating a quasi-natural language UI for GA, Google can forget about making a logical screen-based interface flow that users will NEVER understand. The text example above is a likely prompt for such a system. This will allow Google and others to hide UI complexity using natural language prompts (as in the example above), not button clicks and navigation links, not to mention dynamically generated web pages.
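The "table of functions" idea can be sketched as a dispatch table: the language model's only job is to turn a prompt into a named intent plus arguments, and ordinary code looks that intent up and runs it. Everything in this sketch is hypothetical. The function names and intent names are invented for illustration, and parse_prompt is a crude keyword-matching stand-in for the LLM, not a model call or any real AdWords interface.

```python
import re
from typing import Callable, Dict, Tuple


def create_campaign(headline: str) -> str:
    """Hypothetical backend function: create a new ad campaign."""
    return f"created campaign: {headline}"


def set_weekly_budget(amount_usd: float) -> str:
    """Hypothetical backend function: set the weekly spending cap."""
    return f"weekly budget set to ${amount_usd:.2f}"


# The "table of functions": intent name -> backend function.
FUNCTION_TABLE: Dict[str, Callable[..., str]] = {
    "create_campaign": create_campaign,
    "set_weekly_budget": set_weekly_budget,
}


def parse_prompt(prompt: str) -> Tuple[str, dict]:
    """Stand-in for the LLM: map free text to (intent, arguments).

    A real system would use a trained model here; this stub only
    matches a dollar amount or a double-quoted headline.
    """
    text = prompt.lower()
    amount = re.search(r"\$(\d+(?:\.\d+)?)", prompt)
    if amount and "budget" in text:
        return "set_weekly_budget", {"amount_usd": float(amount.group(1))}
    quoted = re.search(r'"([^"]+)"', prompt)
    if quoted and "ad" in text:
        return "create_campaign", {"headline": quoted.group(1)}
    return "unknown", {}


def execute(prompt: str) -> str:
    """Route a user prompt to a backend function via the table."""
    intent, args = parse_prompt(prompt)
    handler = FUNCTION_TABLE.get(intent)
    if handler is None:
        return "Sorry, I could not map that request to a known function."
    return handler(**args)


print(execute("Set the weekly budget to $500"))
print(execute('Create and run an ad for "Usability Testing Services For Medical Devices"'))
```

The design choice worth noting is that the model never touches the backend directly. It can only select entries that already exist in the table, so the set of possible actions stays fixed and auditable.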
Mental Model Flipping: Using an LLM dialogue HCI design means that no matter the user's mental model, the GA system can instantly match a vast range of user mental models instead of relying on creating a standard user's mental model based on a page-based system with virtually no positive transfer of learning. The LLM HCI flips mental model formation upside down so that the LLM training set is a massive mental model that can be conditioned to match the needs of users worldwide, large and small, computer savvy or much less so. This way of thinking dramatically impacts the design of Category 3 systems. Some hard-programmed interface selection features are required for high-risk task confirmations. But that is another story to be resolved by function allocation analysis.
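The caveat about hard-programmed confirmations for high-risk tasks can be sketched as a simple gate in front of execution. This is a minimal illustration, assuming a fixed list of high-risk actions decided in advance by function allocation analysis. The action names and the ProposedAction type are hypothetical and do not describe any real AdWords feature.

```python
from dataclasses import dataclass

# Hypothetical list of actions that must never run on dialogue input alone.
HIGH_RISK_ACTIONS = frozenset({"increase_budget", "change_payment_method"})


@dataclass(frozen=True)
class ProposedAction:
    """An action the dialogue layer wants to run on the user's behalf."""
    name: str
    description: str


def requires_confirmation(action: ProposedAction) -> bool:
    """High-risk actions need an explicit, hard-programmed confirmation step."""
    return action.name in HIGH_RISK_ACTIONS


def run_action(action: ProposedAction, confirmed: bool = False) -> str:
    """Execute low-risk actions directly; hold high-risk ones until confirmed."""
    if requires_confirmation(action) and not confirmed:
        return f"CONFIRM REQUIRED: {action.description}"
    return f"executed: {action.name}"


raise_budget = ProposedAction("increase_budget", "Raise weekly budget from $500 to $750")
daily_report = ProposedAction("send_report", "Email daily KPI summary")

print(run_action(daily_report))                  # runs immediately
print(run_action(raise_budget))                  # held for confirmation
print(run_action(raise_budget, confirmed=True))  # runs after explicit approval
```

In a real interface the confirmed flag would be set by a conventional button or checkbox, not by the dialogue itself, which is what keeps the confirmation step outside the language model's control.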
Bloomberg vs. Google: In case you think this is a far-fetched concept, this is precisely what Bloomberg just announced concerning its vast repository of FSI data that has been historically accessible through its highly dysfunctional user interface, the Bloomberg terminal. So, we return to where we began by rating Google AdWords as one of the worst UI designs for 2022. But there is hope. Google may lead in using robust LLM software to design complex applications. On the other hand, Bloomberg has a head start.
The Darker Side: Although LLM technology can produce robust solutions to complex UI design problems, this option has a darker side. Should the current bevy of LLMs, including ChatGPT or BARD, be allowed to cross the air gap between text-based content generation and actual machine control, the implications for our future are potentially devastating.
When the exceedingly clever Sam Altman starts proffering APIs that allow machine control, there is cause for grave concern. In the proposal above related to Google AdWords, there are no societal risks if Google fails to train BARD to control AdWords properly. Still, if ChatGPT were allowed to interface with the power grid, there would be a massive concern for our future. It is at this air gap where Congress needs to establish meaningful regulation and product validation.
Charles Lee Mauro CHFP / President and Founder / Mauro Usability Science https://www.dhirubhai.net/in/charles-l-mauro-chfp/
Mauro Usability Science: https://www.mauronewmedia.com/
Research / Editing: Chris Morley / Director of Research MUS
Designer/Advisor
4 months ago: When we get a message from a company, like a bank, or here on LinkedIn, which states, "It works better on the app," they just mean they get to collect more of your personal information, like location, time spent at wherever, etc., and it only works better for them, not us.
Principal Analyst Data Governance | Posting commentary for analysts since 2017 | Brier Score of 0.211 | Experimental science: show me the evidence | Veritas filia temporis | Views mine own
4 months ago: "When one sets out to design a UI for a complex application, the new system must match the existing mental model of users interacting with the new system OR the system design must include massive user training required to construct a workable mental model in the mind of the user. UX designers and software development managers rarely consider this critical research. Instead, they push ahead, creating page-based user interaction flows based on their own intuition of what will be easy to use. This approach is widespread and devastating in terms of usability". Yes indeed, the empty calories of too many a slide deck with notional process flows. I thought there was something afoot in that part of the work flow but found it hard to think my way in. Your term 'massive mental model' is apt methinks
Cofounder and CTO at Artificial Genius Inc.
4 months ago: Charles L Mauro CHFP unlike other popular use cases for LLMs, what you’re proposing here could actually work. It has to be done correctly though, based on the underlying mathematics, or you’re courting the disaster you allude to at the end. If the LLM is hallucinating instructions for other systems, things could go terribly wrong at scale.
Capital Markets | 30yrs of Startups (multiple exits; IPO) | Aerospace Engineer | Current Focus: Decentralized Venture Models
4 months ago: Fantastic article, Charles