Where is the ROI in GenAI?
Created with flux pro. Prompt: The robot Scrooge McDuck will be depicted in a futuristic setting, perhaps standing on a pile of coins

Everybody is talking about GenAI and how your business needs it. There are countless whitepapers and reports on how GenAI will transform verticals, functions and whole economies by creating billions in value. Decision makers are torn between FOMO (fear of missing out) and GenAI fatigue (getting tired of the newest story of a company introducing xyzGPT, which is yet another basic chatbot wrapper).

When doing some research you will find a lot of those reports, a lot of quantified high-level potential and also a lot of use cases where GenAI is praised as the holy grail. But not many companies publish the actual results they achieve from investing money in talent, consultancies, inference costs and software. This might be because a lot of companies are still experimenting, or simply believe in the technology, or do not know how to actually measure the ROI of it.

Don't get me wrong. I believe the technology is changing the world and there is more to come. But when you are investigating the use of a new technology you need to build clear business cases, not only use cases, and you must know how to measure and benchmark the investment.

To provide you with some actual value and not only my thoughts, I have spent several weekends searching the internet for companies that talk about the quantified results they achieved by using GenAI. Unfortunately there were not many of them when I did my research a couple of weeks back.

The use cases with quantified results span content creation, customer support, insights & analytics, and synthetic data generation. While I might create an overview of more use cases in the future, I will not add use cases or collections without clearly measured outcomes to this article.

Customer Support:

Klarna was one of the most prominent examples here on LinkedIn with the achievements of its customer support bot. It is a very nice example because it identifies several levers that create value for the customer and the organization, while also showing a clear prioritization of where Klarna's focus for improvement lies.

The benefits were:

  • The AI assistant has had 2.3 million conversations, two-thirds of Klarna’s customer service chats
  • It is doing the equivalent work of 700 full-time agents
  • It is on par with human agents in regard to customer satisfaction score
  • It is more accurate in errand resolution, leading to a 25% drop in repeat inquiries
  • Customers now resolve their errands in less than 2 mins compared to 11 mins previously
  • It’s available in 23 markets, 24/7 and communicates in more than 35 languages
  • It’s estimated to drive a $40 million USD profit improvement for Klarna in 2024

So the main improvement goal was efficiency. In this case it seems to have been measured properly by the number of support tickets solved by the bot, labeled for more granularity. It was also benchmarked: Klarna apparently knows how many tickets an agent typically solves, so it can compare the unit cost per ticket for an agent and for the bot. They also measured other metrics such as customer satisfaction and service times.

The main takeaway should be: if you are implementing efficiency use cases you need insight into the costs you are generating as-is and what level of improvement you are aiming for. You should also know what the drivers of those costs are (e.g. here, to some extent, repeat inquiries) and how your solution is tailored to tackle each specific driver (e.g. paraphrase detection, labeling, choosing & expanding the knowledge base). From here there are many possibilities to enhance the value of the bot. More on those in a future post.
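To make the benchmarking idea a bit more tangible, here is a minimal sketch of a unit cost comparison per ticket. All figures and function names are hypothetical and only for illustration; they are not Klarna's numbers, and a real comparison would also need to account for things like repeat inquiries and escalations.

```python
def cost_per_ticket(total_cost: float, tickets_resolved: int) -> float:
    """Unit cost: total cost of a support channel divided by the tickets it resolved."""
    if tickets_resolved <= 0:
        raise ValueError("tickets_resolved must be positive")
    return total_cost / tickets_resolved


def savings_from_bot(agent_unit_cost: float, bot_unit_cost: float,
                     tickets_resolved_by_bot: int) -> float:
    """Savings if the bot's tickets would otherwise have been handled by agents."""
    return (agent_unit_cost - bot_unit_cost) * tickets_resolved_by_bot


# Hypothetical monthly figures (assumptions, not taken from any company report).
agent_unit_cost = cost_per_ticket(total_cost=5_000, tickets_resolved=1_000)
bot_unit_cost = cost_per_ticket(total_cost=25_000, tickets_resolved=50_000)
monthly_savings = savings_from_bot(agent_unit_cost, bot_unit_cost, 50_000)

print(f"agent: ${agent_unit_cost:.2f}/ticket, bot: ${bot_unit_cost:.2f}/ticket, "
      f"savings: ${monthly_savings:,.0f}/month")
```

The point is less the formula than the inputs: without the as-is cost per ticket you have nothing to benchmark the bot against.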

Content Creation:

The key drivers are:

  • Cost Reduction in External Agency Expenses: Klarna has decreased its spending on external marketing suppliers by 25%, including translation, production, CRM, and social agencies, with run rate savings of $4 million.
  • Savings on Image Production: Achieved a $6 million reduction in image production costs, despite running more campaigns and creating significantly more images. Using genAI tools like Midjourney, DALL-E, and Firefly for image generation, plus Topaz Gigapixel and Photoroom for final adjustments, Klarna saved $1.5 million in the first quarter of 2024 alone.
  • Increased Efficiency and Creativity: Generated over 1,000 images in the first three months of 2024 using genAI, reducing the image development cycle from 6 weeks to just 7 days. This accelerated cycle includes checks for brand consistency, image quality, and legal compliance.


Key Results (Zoom x Synthesia):

  • 90% time savings
  • 200+ micro videos
  • up to $1,500 in cost savings per employee


Key Results (IU x Synthesia):

  • 100+ courses with AI video
  • improve production speed from months to weeks


Results (Lonely Planet x AWS):

  • over 1,000 new personalized experiences
  • 80% reduced cost per creation

I think content & media generation is probably one of the lowest-hanging fruits in GenAI. It is also an area where we are just at the beginning and where a lot is happening.

Synthetic Data:

Source: MAPFRE (see Sources)

Results:

  • Assuming a constant number of 100 fraudulent claims per year, the improvement in recall from 62.5% to 93.75% leads to capturing around 31 more fraudulent exposures per year
  • Assuming an average of $10,000 in savings per exposure, this translates into around $310,000 in annual savings
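
The arithmetic behind these two bullets can be reproduced in a few lines. This is just a sketch of the calculation under the assumptions stated above (100 fraudulent claims per year, $10,000 saved per captured exposure); the function name is mine.

```python
def extra_fraud_savings(claims_per_year: int, recall_before: float,
                        recall_after: float, savings_per_claim: float):
    """Return (additional fraudulent claims caught, additional annual savings)."""
    extra_caught = claims_per_year * (recall_after - recall_before)
    return extra_caught, extra_caught * savings_per_claim


extra, savings = extra_fraud_savings(100, 0.625, 0.9375, 10_000)
print(f"{extra:.2f} more fraudulent claims caught, about ${savings:,.0f} saved per year")
```

Unrounded, this gives 31.25 additional claims and $312,500; the figures above round down to 31 claims and $310,000.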

I think this is one of the more exciting use cases of GenAI, because here the technology is used in a smart way to enhance existing capabilities. It is also very observable because there is a process in place, and it tackles a high-stakes environment.

What about the costs?

An interesting perspective on TCO for chatbots comes from McKinsey (see Sources).


To calculate the ROI of your investment you need to evaluate not only the benefits you can reap but also the investments required to generate them. This TCO overview by McKinsey can serve as a first idea. I think that for advanced chatbots with expanded capabilities it overestimates the importance of fine-tuning and underestimates the costs for product managers and software developers. At the same time, the cost and time for integration and for building and maintaining the plug-in layer will go down drastically in the near future, making adoption significantly easier. Thus inference and the FTEs needed to expand capabilities will be the main cost factors, while inference so far keeps getting cheaper and can be measured nicely at the unit economics level.

Whether to go the route of fine-tuning is also a very strategic decision (out-of-the-box models often perform "good enough" at the beginning). Integration costs will go down for sure, but fine-tuning costs cannot be treated as a one-off. When new foundation models emerge, you need to re-evaluate how the previously fine-tuned model performs against the new model and whether it makes sense to fine-tune the new model again (probably). And once all companies leverage GenAI to some extent and all publicly available data has found its way into the foundation models, the competitive edge might ultimately be proprietary data.

Side note: At the moment I also do not see a significant advantage in choosing industry models or DIY models over foundation models (especially fine-tuned ones) other than in very limited use cases.
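
As a rough sketch of how benefits and costs come together, ROI can be computed as (benefit - total cost) / total cost. The cost categories below follow the ones discussed above; the class, the function and all figures are hypothetical assumptions for illustration, not numbers from McKinsey or any company.

```python
from dataclasses import dataclass


@dataclass
class AnnualCosts:
    """Hypothetical annual cost components of a GenAI chatbot."""
    inference: float
    staff: float          # product managers, software developers
    integration: float
    fine_tuning: float    # recurring, since new foundation models trigger re-evaluation

    def total(self) -> float:
        return self.inference + self.staff + self.integration + self.fine_tuning


def roi(annual_benefit: float, costs: AnnualCosts) -> float:
    """ROI as a fraction: (benefit - total cost) / total cost."""
    return (annual_benefit - costs.total()) / costs.total()


# Made-up example figures.
costs = AnnualCosts(inference=100_000, staff=300_000,
                    integration=50_000, fine_tuning=50_000)
result = roi(annual_benefit=750_000, costs=costs)
print(f"TCO: ${costs.total():,.0f}, ROI: {result:.0%}")
```

Keeping the cost components separate makes it easy to see how the picture changes when, for example, inference gets cheaper or fine-tuning has to be repeated.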

I hope these examples were interesting. Let me know which other examples of quantified results for specific GenAI use cases you are aware of, or where you see the biggest obstacle to realizing or measuring ROI!


Sources:

Klarna Chatbot

Klarna Marketing

Zoom x Synthesia

IU x Synthesia

Lonely Planet x AWS

MAPFRE

McKinsey

