Why NPS and CSAT don’t work for measuring your AI efforts

When evaluating AI initiatives, I see many organisations use NPS and CSAT as barometers of success. But there’s a problem with both of those metrics that makes them ineffective at measuring the success of interactions with AI services.


This edition of the newsletter is sponsored by Intel®.

Our world is determined and shaped by invisible forces. Here at VUX World, we believe that just like the four fundamental forces of nature, we’re influenced by the power of AI. Unlike the forces of nature, you have to seek the power of AI. Now it’s easier to discover the true power of AI PC for your business and team with Intel® Core Ultra processors.

Unlock the power of fast and reliable AI PCs here: https://intel.ly/3A3Ziwq


This has been a top-of-mind topic lately. A long while ago, I wrote about how containment is the wrong metric. Now it's time for NPS and CSAT to take their turn. We covered this in detail in my conversation with Pypestream CEO Richard Smullen on the VUX World podcast, and it's a conversation I've been having with clients for almost two years. The eagle-eyed among you might be thinking: "Hold on a minute, Kane, you recently posted that the way you measure value today is exactly the way to measure the value of AI tomorrow. I measure value with NPS and CSAT, so what gives?"

As you'll see in this post, there's nothing inherently wrong with NPS and CSAT; it depends on what you're trying to measure. There are three levels at which you want to measure success in your business and with AI:

  1. Business-level metrics, like cost, revenue and loyalty.
  2. Journey-level metrics, like satisfaction and goal completion.
  3. Interaction-level metrics, like experience, adoption and usefulness.
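To make the three levels concrete, here's a minimal sketch of how you might tag each metric with the level it belongs to, so a dashboard doesn't mix an interaction-level signal into a business-level report. The metric names and their groupings here are illustrative assumptions, not a definitive taxonomy:

```python
from enum import Enum

class Level(Enum):
    BUSINESS = "business"        # long-term: cost, revenue, loyalty
    JOURNEY = "journey"          # end-to-end: satisfaction, goal completion
    INTERACTION = "interaction"  # single touchpoint: experience, effort, adoption

# Illustrative mapping; the exact metric names are assumptions.
METRIC_LEVELS = {
    "nps": Level.BUSINESS,
    "revenue": Level.BUSINESS,
    "csat": Level.JOURNEY,
    "goal_completion": Level.JOURNEY,
    "customer_effort": Level.INTERACTION,
    "adoption": Level.INTERACTION,
}

def metrics_for(level: Level) -> list[str]:
    """Return the metrics that belong at a given measurement level."""
    return [name for name, lvl in METRIC_LEVELS.items() if lvl is level]

print(metrics_for(Level.INTERACTION))  # ['customer_effort', 'adoption']
```

The point of the separation is the same as the article's: a question designed for one level (NPS at the business level) shouldn't be used to judge another (a single interaction).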

What’s wrong with NPS?

NPS (Net Promoter Score) is a long-term business-level metric that measures how likely a customer is to recommend your brand to a friend. For decades, this has been a keen indicator of happy customers and loyalty.

However, you can’t use NPS to judge the success of an interaction with AI. Because it’s a long-term metric, it takes into account everything the customer knows and feels about your brand. It’s not intended to measure the interaction the customer has just had, but, in general terms, how loyal they are likely to be. One interaction isn’t enough to sway customer loyalty in many cases.

Let’s say you’re a bank. You have a chatbot. Your customer uses it and then you ask them an NPS question. That doesn’t tell you how effective your chatbot is. It tells you how loyal that customer is.

For example, take Mary. She’s banked with The Bank of Kane for the last 15 years. She trusts the Bank of Kane with her life savings and mortgage. Yet, she has a poor experience with the Bank of Kane’s chatbot (I know, not likely ;) but play along). When Mary is asked the NPS question, she’s bringing all of her trust and baggage with her. It’s likely she’ll still give a high NPS because she trusts the bank with her life savings!

Now, you might rephrase the question and ask ‘based on your experience of this chatbot, how likely are you to recommend The Bank of Kane to a friend?’ which I’ve seen before. But that’s conflating two types of questions: a short-term question asking about the experience itself and a long-term measure of loyalty. It doesn’t sufficiently differentiate between how you feel about the brand overall vs how you feel about the experience you’ve just had. It’s like a travel agent trying to determine long-term loyalty by asking a question about the flight. Some freak turbulence might have made the flight a nightmare, but that won’t affect the overall holiday.

As the quote often attributed to Einstein goes: “If I had an hour to solve a problem and my life depended on it, I would use the first 55 minutes determining the proper question to ask, for once I know the proper question, I could solve the problem in less than five minutes.”

When trying to quantify the effectiveness of an interaction, NPS is the wrong question.

What about CSAT?

Let’s take CSAT instead, surely that’s a better metric? Well, no, not really. Not if you’re trying to measure the interaction itself.

CSAT (Customer Satisfaction) is a short-term metric that measures how satisfied a customer is with a service in the moment. Generally speaking, it’s a good measure and will tell you whether there are any service issues you need to deal with. But it won’t always tell you about the effectiveness of the specific solution a customer has just interacted with.

For example, Jim is also a customer of The Bank of Kane, and he wants to take out a new credit card. So he has a chat with the Kane AI Agent, which tells him that his credit rating isn’t high enough and that there’s no suitable credit card for him. Then he’s asked a CSAT question: how satisfied are you with this service?

What do you think he’ll say? Unsatisfied, of course. Why? Because he didn’t get the outcome he was hoping for.

Notice that Jim’s dissatisfaction has nothing to do with the experience of interacting with the AI agent. It has everything to do with the fact that he was told something he didn’t like and didn’t get the result he wanted, even though the Kane AI Agent did everything absolutely right and told Jim whether he was eligible in 30 seconds.

So what’s the solution?

The solution: effort

Customer Effort Score (CES) asks how effortful the interaction you’ve just had was. It has nothing to do with how you feel about the brand overall, so it cuts through the baggage and bias you carry. It has nothing to do with how you feel about the outcome you’ve just received, so it doesn’t penalise a perfectly working service. It simply appraises the interaction you’ve just had and whether it was easy or difficult.

That’s the score that matters when trying to measure the effectiveness of the interaction with your AI solution. Why? Because why else would you use AI? Surely because it’s easier, more efficient and more effective than other channels, modalities or technologies. How do you measure that? Effort.

We have a formula for measuring effort which takes into account turn count, escalations, abandons, fallbacks, disambiguations and goal completion that I’d happily share more about for those interested.
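To illustrate how those signals could combine into a single score, here's a minimal sketch. To be clear, the weights, the penalty structure and the 0–100 scale are all my illustrative assumptions, not the actual VUX World formula referenced above:

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    turns: int              # total conversational turns
    escalated: bool         # handed off to a human agent
    abandoned: bool         # user gave up mid-conversation
    fallbacks: int          # "sorry, I didn't understand" responses
    disambiguations: int    # times the user had to clarify intent
    goal_completed: bool    # did the user achieve what they came for

def effort_score(i: Interaction, expected_turns: int = 4) -> float:
    """Return a 0-100 score where higher means lower effort (easier).

    All weights below are illustrative assumptions.
    """
    score = 100.0
    score -= 5 * max(0, i.turns - expected_turns)  # penalise long dialogues
    score -= 10 * i.fallbacks                      # bot didn't understand
    score -= 5 * i.disambiguations                 # user had to clarify
    if i.escalated:
        score -= 25                                # needed a human after all
    if i.abandoned:
        score -= 40                                # user gave up entirely
    if not i.goal_completed:
        score -= 20                                # goal not reached
    return max(0.0, min(100.0, score))

smooth = Interaction(turns=3, escalated=False, abandoned=False,
                     fallbacks=0, disambiguations=0, goal_completed=True)
rough = Interaction(turns=9, escalated=True, abandoned=False,
                    fallbacks=2, disambiguations=1, goal_completed=False)
print(effort_score(smooth))  # 100.0
print(effort_score(rough))   # 5.0
```

The design choice worth noting: every input is observable from conversation logs, so unlike a survey-based CES question, a score like this can be computed for every interaction, not just the ones where the customer bothers to answer.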

Is it perfect? No. Is it good enough? Absolutely.

So, use NPS and CSAT to measure loyalty and satisfaction on the whole, but not to measure the effectiveness of the interaction itself. Use effort for that.


About Kane Simms

Kane Simms is the front door to the world of AI-powered customer experience, helping business leaders and teams understand why AI technologies are revolutionising the way businesses operate.

He's a Harvard Business Review-published thought-leader and a LinkedIn 'Top Voice' for both Artificial Intelligence and Customer Experience, who helps executives formulate the future of customer experience and business automation strategies.

His consultancy, VUX World, helps businesses formulate business improvement strategies, through designing, building and implementing revolutionary products and services built on AI technologies.

  1. Subscribe to VUX World newsletter
  2. Listen to the VUX World podcast on Apple, Spotify or wherever you get your podcasts
  3. Take our free conversational AI maturity assessment


We use a collection of metrics to measure success with our bot: Escalation vs Containment, Customer Effort, CSAT, User Query vs Bot Response analysis and a few others. It's the collection of these that tells the story of the user experience, which for us is a primary goal. Did we give them what they needed when they needed it? Was it easier or harder to get the right answer? Did we even have the correct answer in the content/flow?

Iqbal Javaid

Head of CX Solution Engineering EMEA at Zoom | Unleash CX Podcast | Follow for the latest news on AI, CX & Zoom


So on point Kane! As consumers our initial apprehension around effort when contacting a brand is such a key factor. Organisations should be thinking hard about driving and prioritising CES, but so many struggle to collect enough meaningful data to redesign the right customer journeys. I think we are now close to seeing how AI can play a key role in determining CES & CSAT, based on various factors in addition to surveys. Question is could we use some of this self-learning to automate change?

Rich Mobley

Multi-award-winning Speech Analytics and Voice of the Customer expert


Agree. We use customer effort as the key measure for gaining meaningful, actionable insight on all journey points. Great post!

Jesse Chen

Creating Impactful CX Teams @ Charm | Gartner Alum | Co-Founder


I'd love to learn more about how you're setting up CES
