The promise of faster, more efficient customer service via AI (and its host of challenges)
Photo by Google DeepMind on Unsplash

The promise of faster, more efficient customer service via AI (and its host of challenges)

Many businesses struggle to hire and equip enough employees to answer customer questions, troubleshoot issues, or check the completeness and correctness of medical claims, to name a few customer care related tasks. Enter the AI chatbot to help handle customer requests at scale in near real time.

For a entertaining example, watch AI vs. the Drive-Through, in which WSJ senior personal tech columnist Joanna Stern ordered from an AI-powered drive-though around 30 times to find out how well it works. Turns out not too bad, even when she changes her mind, plays dog barking sounds, or asks weird questions.

Source: Drive-Thru AI Chatbot vs. Fast-Food Worker: We Tested the Tech | WSJ

Prior to Generative AI, chatbots used to reduce the need for live agents to handle mundane tasks were typically supported by Natural Language Processing and rules-based engines. And while these traditional conversational agents can be good at quickly surfacing answers buried deep within a company’s FAQ, they aren’t equipped to navigate the same level of nuance and context that ChatGPT and other large language models (LLMs) can.

Studies are starting to show how GenAI assistants can increase the productivity of customer service agents and improve metrics like time to ticket resolution. The difference is felt primarily among novice workers, which is good news that can be translated into fewer job requirements, wider candidate pool, and accelerated learning curve for new hires. Well-designed chatbots and AI assistants can equip inexperienced customer service agents to better handle customer issues and decrease the risks of negative outcomes such as customer dissatisfaction and employee burnout and attrition.

But conversational agents are also an emblematic example of the unique risks presented by GenAI-powered applications:

  • With a rules-based chatbot that relies on a predefined set of rules to generate responses to user inputs, it’s often acceptable to create a process to catch issues after the fact and update the solution later.
  • An AI chatbot, on the other hand, needs to be designed with additional guardrails to prevent unexpected responses that may include incorrect, inappropriate, harmful, or even illegal content.

For organizations interested in leveraging GenAI to improve their customer care functions, the following steps can substantially mitigate the risks:

1. Evaluate your use cases against GenAI strengths and weaknesses

Even the latest generation of LLMs suffers from various well-documented weaknesses common to language models. A good understanding of the limitations and potential methods of error mitigation is key to choosing the right scope of assistance to be delegated to generative AI.

For example, when planning an AI assistant, if the idea is to allow customers to inquire about the status of work orders, will the model have access to the relevant records so it can provide a fact-based response without fabricating content? If partners can ask for order estimates, will it be able to combine user provided data such as product quantities and customizations with information from the various source systems that contain the relevant rules for error-free pricing?

Depending on the limitations on grounding the AI assistant responses with verified data, the scope of the use case may need to be adjusted. For example, from an application that interacts directly with customers to an internal-facing assistant that focuses on knowledge retrieval and simple calculations to make customer care agents more efficient.

2. Involve experts in the prompt engineering process

Prompts?—?the inputs or queries used to elicit specific responses from a large language model?—?play a crucial role in the quality of the output of a GenAI-powered solution.

For an AI chatbot to produce fact-based responses that are grounded in the appropriate source material, it must be offered carefully constructed prompts. Depending on the use case, this may require in-context instruction tuning.

When providing product pricing information, for instance, rather than using the original question directly as the prompt, the system may need to include additional instructions directing the model to use the company’s quoting tool to retrieve the correct estimates to include in the answer.

3. Dedicate enough time and resources to plan and execute extensive testing and validation

While this step is essential for any software application, with GenAI the stakes are higher. For instance, asking a rules-based chatbot “Are you sure?” after receiving an answer will not cause the original answer to change. With GenAI, on the other hand, it's quite possible for this kind of follow-up question to trigger a contradictory response.

Pre- and post-production testing practices should be designed by professionals with deep understanding of the weaknesses and quirks of large language models, and offer a confidence percentage that GenAI responses are aligned with company policies and regulatory standards.

Test plans must be created with security risks in mind, including, among others, prompt injection, data leakage, and supplier risks if the business is buying a third-party product or service that includes generative AI.

4. Make sure your first generation AI assistant is well supervised by humans

No matter how robust your testing process is, it will never cover all potential pitfalls associated with adopting a language model as a conversational agent.

To avoid negative (or even disastrous) surprises, it is advisable that the first iteration of an GenAI-powered assistant is deployed to support or augment the work of humans while model robustness is tested.

For example, instead of interacting directly with the customer, first it can be used to help a salesperson articulate product features and benefits during a sales call, or a technician identify the possible causes of an issue during a support chat.

In the case of a drive-through assistant, for a period an attendant could be listening in to confirm that the chatbot is not only taking orders correctly, but avoiding undesired behaviors like repeatedly asking if the customer wants a peach pie after she declined the offer.

Source: Drive-Thru AI Chatbot vs. Fast-Food Worker: We Tested the Tech | WSJ

Once matured and reliably useful, the bot should be able to handle simpler, more straightforward tasks on its own, while still being programmed to escalate the situation to a human in scenarios like a top business customer attempting to negotiate a price quote or a person asking an allergy question while placing a food order.

A case in which the chatbot failed, responding "No" when the answer should be "Yes".

5. Establish a workflow for output checking and filtering

The goal of this step is to ensure that GenAI responses are safe, ethical, mindful of the rights and privacy of everyone involved, and aligned with company policies and regulatory standards.?

A robust monitoring solution for model outputs may require a blend of traditional rules-based filtering and machine learning models, and include a process to hold back dubious responses to be reviewed by a human employee before they are forwarded to the end-user.

6. Identify and implement the appropriate mechanisms to enable continuous learning and?refining

To ensure that the AI assistant continues to behave optimally in a dynamic environment, three essential elements should be in place:

  • Post-production testing practices that check for things like bias, adherence to regulations and internal policies, and vulnerabilities to adversarial attacks, suggesting actions for improvements.
  • Automatic monitoring tools that can issue alerts when anomalies such as broken pipelines or concept drift require action, as well as automatically block, flag, or correct toxic or erroneous outputs.
  • User feedback mechanisms that help produce answers to questions like, "Does the AI assistant understand the user’s intent?", and "Are its responses appropriate to the user’s query?"

_____

Generative models, despite their immense potential to improve customer experience, create new vulnerabilities. And the closer AI gets to a company’s core business and customers, the harsher the consequences of model failures become.

Building and operating GenAI assistants in a legal, safe, reliable, and fair manner for customers and employees requires detailed attention to all stages of the AI solution lifecycle, from properly evaluating use cases to staying on top of emerging security risks. It may sound like an insurmountable challenge, but with a deliberate, structured, and de-risked approach to generative AI, organizations in any sector or industry can deploy it responsibly and productively for better user experiences in the customer care space.



You may also like:

Why generative AI is just the latest technology getting in the way of smart data management

AI rich, insight poor: The challenge plaguing corporate generative AI efforts in customer service

要查看或添加评论,请登录

Adriana Beal的更多文章

社区洞察

其他会员也浏览了