Developing a company-specific ChatGPT for Enefit
Image by Author with the assistance of DALL E


An overview of developing a company-specific ChatGPT for the employees of a large energy company, and of measuring its value

Problem

Enefit is one of the largest energy companies in the Baltics, employing over 5,000 people and operating in five different markets (Estonia, Latvia, Lithuania, Poland, and Finland). Over the decades, the document base directed at the company’s employees has grown to thousands of documents (strategy, rules, procedures, process descriptions, guidelines, etc.), and ideally, employees should be aware of all of them.

When we asked employees to measure the time it takes to find answers to company-related questions, the results were as follows:

  • It typically takes 18 minutes (median) to find an answer in internal documents;
  • For about 10% of the questions, it takes an hour or more to get an answer (waiting for a response from a manager or colleague);
  • For about 5% of the questions, no answer is ever received (searched for a couple of hours and then given up).

Thus, employees struggle to navigate the maze of orders, rules, and guidelines. As a result, efficiency is lower than it could be, and the execution of processes deviates from the ideal.

If an employee has to search for answers to an average of two questions per day, they spend more than 150 working hours per year on this activity (i.e., 252 working days × 2 questions × 18 minutes = 9,072 minutes, or about 151 working hours).
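This back-of-the-envelope estimate is easy to verify with a few lines of Python, using the figures from the text (252 working days, 2 questions per day, 18 minutes per question):

```python
# Annual time spent searching for answers, per employee,
# using the figures reported in the article.
WORKING_DAYS_PER_YEAR = 252
QUESTIONS_PER_DAY = 2
MINUTES_PER_QUESTION = 18  # median search time without the assistant

minutes_per_year = WORKING_DAYS_PER_YEAR * QUESTIONS_PER_DAY * MINUTES_PER_QUESTION
hours_per_year = minutes_per_year / 60

print(minutes_per_year)        # 9072
print(round(hours_per_year))   # 151
```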

Development process

To address the problem, we began developing a custom ChatGPT for the company this spring. The goal was to create a solution based on GPT language models that could answer any question about the company’s operations, whether it’s about strategy, processes, or health insurance.

At the beginning of the development activity, we lacked an understanding of the feasibility of creating such technology. Therefore, we followed the Plan-Develop-Learn principle, where users test each step of the development, and based on their feedback, the subsequent development tasks are defined:

  1. Plan – formulate a development hypothesis you want to test;
  2. Develop – carry out the necessary developments;
  3. Learn – test the developments on users and summarize the results.

The project’s scope and the number of testers grew with each iteration. The goal of the first iteration was to test whether it’s fundamentally possible to create a company-specific ChatGPT that could provide answers based on our own documents.

In the second phase, we focused on documents from a specific area and included specialists familiar with those documents as testers. The aim was to prove that artificial intelligence can provide answers that are logical for specialists.

In the third phase, we expanded the document base, feeding the system all the documents directed at employees, and increased the number of testers to about 80. These testers were selected from 14 different units and all five home markets.

In the final phase, the document base will no longer be expanded, and the chatbot will be implemented.

The development process of company-specific virtual agent (illustration by Author)

How It Works

The developed virtual assistant operates on the RAG (Retrieval Augmented Generation) principle. First, the relevant pieces of information are retrieved from a vector database; then, the GPT model composes an answer from them. The generalized workflow is illustrated in the following diagram:

  1. Users pose questions through the Microsoft Teams interface – a conversation with the virtual assistant can be initiated like a chat with a team member.
  2. Semantic search is used to find documents that are most likely related to the user’s question.
  3. Relevant documents, along with the question and answering guidelines, are sent to the GPT-4 model.
  4. The answer provided by the GPT-4 model is sent to the user.

The workflow of GPT-based retrieval augmented generation (Illustration by Author)
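The retrieval-and-prompting core of steps 2–3 can be sketched in a few lines of Python. This is an illustrative sketch, not Enefit's actual implementation: the toy 3-dimensional vectors stand in for real embeddings (which would come from an embedding model), and `retrieve`, `build_prompt`, and the sample documents are hypothetical names.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(question_vec, doc_store, top_k=3):
    """Semantic search: rank document chunks by similarity to the question."""
    ranked = sorted(doc_store, key=lambda d: cosine(question_vec, d["vec"]), reverse=True)
    return ranked[:top_k]

def build_prompt(question, chunks):
    """Combine answering guidelines, retrieved context, and the question."""
    context = "\n\n".join(c["text"] for c in chunks)
    return (
        "Answer using only the internal documents below. "
        "If the answer is not in them, say so.\n\n"
        f"Documents:\n{context}\n\n"
        f"Question: {question}"
    )

# Toy example with 3-dimensional stand-in embeddings.
docs = [
    {"text": "Health insurance covers dental care up to 200 EUR.", "vec": [0.9, 0.1, 0.0]},
    {"text": "The strategy targets carbon neutrality by 2045.", "vec": [0.0, 0.9, 0.2]},
]
top = retrieve([0.8, 0.2, 0.1], docs, top_k=1)
prompt = build_prompt("What does health insurance cover?", top)
```

In production, the cosine search would be handled by a vector database, and the assembled prompt would be sent to the GPT-4 chat API (step 3), whose answer is then returned to the user in Teams (step 4).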

Initial results

Throughout the project, we have learned a lot. Here are some of the most significant and interesting discoveries:

  • With the help of a virtual assistant, it takes an average of 30 seconds to find answers from internal documents. This means a time-saving of 17.5 minutes for each question (without the virtual assistant, the time taken was an average of 18 minutes).

If an employee has to search for answers to an average of two questions per day, then the virtual assistant helps to save over 140 working hours per year (i.e., 252 working days × 2 questions × 17.5 minutes of savings = 8,820 minutes, or about 147 working hours).

  • The virtual assistant offers potential to elevate document management to a new level – we’ve repeatedly seen during tests how the chatbot points out outdated documents that the document owner should update.
  • This technology changes existing habits, meaning its successful adoption is critically dependent on implementation and training for both employees and document owners.

Conclusion

This was a brief overview of the project to create a customized chatbot for Enefit – an energy company with more than 5,000 employees in five home markets. I will write more about various aspects of the project in the future: testing, measured value, challenges, and our vision for the solution.


Kristjan Eljand

Building AI & optimization solutions since 2011


This looks nice, great work! I do wonder though, I have been looking into a similar solution, focused mainly on using Enefit Power documentation. I did some reading and poking, and what often came up in terms of a "company-specific" ChatGPT solution was an open-source project called quivr (https://github.com/StanGirard/quivr). Is this particular solution in any way related to quivr? Also, is there perhaps a GitHub repo for more interested readers?

Marlon Dumas

Chief Product Officer at Apromore | Professor at University of Tartu


Great case study, very concrete. No critique meant here, but I am not sure that labor cost savings will drive business adoption of LLMs in the field of "enterprise search". The driver will most likely be what people can do that they did not do before, like finding richer answers to their questions, or discovering relevant questions, issues, risks, or opportunities they were not asking about (e.g. discovering outdated documents, high-risk or non-compliant practices, etc.). You hint at it. Hard to measure at the beginning, of course, though it can be partly measured via user satisfaction scores.
