Leveraging OpenAI’s Whisper and ChatGPT with UiPath for the Contact Center

Leveraging OpenAI’s Whisper and ChatGPT with UiPath for the Contact Center

A guide to gaining insights from contact center recordings and transcriptions with AI and automation

Alright everyone, this guide is mostly about me jumping on the OpenAI bandwagon. It seems to be getting full and I am frankly feeling a bit of fomo.??

?To start, this guide came about thanks to the convergence of a few events:?

  1. Requests for a simple, cost effective (i.e., free) method to analyze call recordings, voicemails, and similar audio, and then automate based on insights from AI.?
  2. UiPath’s preview release of a connector for OpenAI. I started out with the intent of just spending an hour playing around with it, but quickly found it to be a powerful (and addictive) tool combining AI and automation.?
  3. And the oh-so-fun event of IT sending me a note that my PC may be compromised due to it running an older version of Python, and I thus needed to upgrade it stat. Stick with me, the relevance of this will be clearer later.

Let’s get started. First, a little about this guide:?

  • If you are not interested in the technical bits and bytes, choose your own adventure, scroll to the end for the summary
  • Assumes you are familiar with UiPath and have UiPath Studio?
  • Assumes Python is installed on the machine or VM where you will run OpenAI Whisper?
  • Assumes you are familiar with PowerShell?
  • This is for Windows; Mac and Linux options are available but not covered in this guide?
  • This guide shows the art of the possible and is not “production worthy.” I have included ideas for production in the guide.?
  • This guide uses OpenAI’s Whisper for speech-to-text. This fulfills a common request for a “free” option. There are many speech-to-text tools that may be a better fit for your needs. They are not covered in this guide.?

Step #1: Install OpenAI Whisper?

We will first install Whisper. If you are going to have problems it will likely be here and I don’t want you to go through all the other steps if this doesn’t work.?

  1. Open a command prompt as an admin?

  • Window + X
  • Select Windows Powershell (Admin)
  • Confirm permissions??


Implementation Note?

I installed Whisper on my local PC and I am using a UiPath Attended robot. For a production implementation you may want to use a UiPath Unattended robot and would thus need to install Whisper on the VM(s) with that robot.

Whisper also has an API. This may be a better option for environments where running Python isn't practical.??


2. Check that you have Python installed by checking the Python version number?

  • Enter py --version into the PowerShell command prompt
  • A version number should appear. If it doesn’t, you need to install Python or find a machine that does have Python. I am not covering Python installation in this guide because there are too many factors to consider and there’s a lot of content available on the topic. Just ask ChatGPT.

3. Install and upgrade PIP?

  • Enter py -m pip install --upgrade pip?

4. Use Chocolatey to upgrade python?

  • Enter choco upgrade python -y
  • Note: Recall above I mentioned that IT asked me to upgrade Python. That's how I got here. I was already this far so I thought I should give this Whisper tool a try.??
  • Tip: I believe when you install and upgrade PIP it will also install Chocolatey. It was already on my PC. If the choco command doesn’t work, you may need to install Chocolatey manually.?

5. Install ffmpeg?

  • Enter choco install ffmpeg
  • Enter pip3 install python-ffmpeg?

6. Install OpenAI Whisper (finally!)?

  • Enter pip3 install git+https://github.com/openai/whisper.git

7. Take Whisper for a test drive?

  • Enter whisper --model base [INSERT THE PATH TO YOUR RECORDING]?

Example: whisper --model base “C:\Voicemails\Mar\Call0000001.m4a”?

  • Tip: If you know the language of your recordings, add that as a parameter for faster processing?
  • Tip: Enter whisper at the prompt to see a list of all parameter options and languages supported?
  • Tip: Whisper will generate several files including a text file of the transcribed audio. You might want to cd to a directory where you would like these files stored.
  • Tip: If you don’t have audio recordings handy, use Windows Voice Recorder to create and save a recording

Step #2: Use Whisper in UiPath?

Another reminder that this is an art of the possible guide. You wouldn’t want to do some of what I have done below in a production implementation. :)?

  1. Open UiPath Studio, start a new blank project?
  2. Drop a For Each File In Folder UiPath Activity onto the sequence?

  • Tip: You may need to add the dependency for this and other activities mentioned below via Manage Packages in UiPath Studio.?
  • Set In Folder to the folder where your audio recording files are saved?
  • Set Filter By to *.m4a (or the file extension(s) used for your recordings)?


Implementation Note?

In my example I will loop through the audio recording files using For Each File in Folder, convert them to text, and then move the files to an archive folder. You will need to research the options for accessing the audio files in your environment and design the best option for you (i.e., trigger based, scheduled, file system versus web interface, etc.).?


3. Drop an Invoke Power Shell UiPath Activity into For Each File In Folder?

  • Open Properties for the activity?
  • Set Continue on error to True?
  • Note: Whisper may throw a processor error when starting. Whisper handles this error on its own, but when Continue on error is not set to True it causes the automation to stop.?
  • Set Command text to "whisper --model base '" + CurrentFile.FullName + "'"?
  • Tip: Mind the single and double quotes. You need them.?
  • Tip: CurrentFile should be the default variable in the For each property of the For Each File in Folder activity above. If it is different, replace CurrentFile with your variable name.?
  • Set Is Script to checked?


Implementation Note?

I believe entering as a script is definitely not a best practice. I am using in this example for maximum flexibility and speed in testing the various options available with Whisper. UiPath provides Input and Parameters properties for Invoke Power Shell that could be used to improve the durability of the automation.?

Also, I am using PowerShell because it’s what I know. UiPath has Python activities that could be used in your automation for interacting with Whisper. Leveraging the Python activities is not covered in this guide. Using the Python activities in place of or together with PowerShell may improve your automation.?


4. Add an Assign activity after Invoke Power Shell?

  • Note: Whisper will generate a .txt file and we want to read that file, but we first need a variable that will point to it. I am simply taking the file name and replacing the extension with .txt.?
  • Create a variable named newString as type String?
  • Create a variable named oldString as type String and default value CurrentFile.Name?
  • In the Assign activity set newString = oldString.Replace("m4a","txt")?
  • Tip: Replace m4a with your file extension or use other String functions to replace any extension with txt?

5. Add a Read Text File activity in the For Each File in Folder activity after Invoke Power Shell and Assign?

  • Create a variable named txtOutput as type String
  • Set Output to txtOutput
  • Set Filename property to "INSERT DIRECTORY WHERE WHISPER IS SAVING THE TRANSCRIPTION" + newString??

6. Use a Message Box (or similar activity) to display txtOutput?

7. Run and test the automation

Step #3: Get insights from OpenAI ChatGPT?

In this step we will setup the UiPath Integration Service connector for OpenAI and add it to our automation. If you don’t have access to Integration Service, this is the end of the adventure, but there’s still a lot you can do with the txtOutput from above, so go have some automation fun!?

  1. Get an API key from OpenAI?


Implementation Note?

This will create a personal key. For a production implementation you may need to work with your IT team to create an enterprise key.


2. Configure the OpenAI connector in UiPath Integration Service?

  • Navigate to Integration Service?
  • Click on the OpenAI connector?
  • Click Add New Connection?
  • Enter “Bearer INSERT YOUR API KEY FROM TASK 1 ABOVE?
  • Tip: You don’t need the quotes, but you do need Bearer and a space between it and your API key. And be sure to remove any trailing spaces from your key.?
  • Click Connect?
  • You will see a successful connection in Integration Service?

3. Configure the automation to use the OpenAI connector?

  • Return to the Whisper automation file from Step #2?
  • Open Manage Packages and add the UiPath.OpenAI.IntegrationService.Activities package?
  • Drag a Generate Text Completion onto the sequence after the Read Text File activity?
  • Set the Connection property to the OpenAI connection from Task #2 above?
  • Tip: If you don’t see an option to select in the Connection, Studio may not be connected to the instance of UiPath where Integration Service is running. Check your settings.?
  • Create a variable named collection of type UiPath.OpenAI.IntegrationService.Client.GenerateTextCompletion?
  • Click Show advanced options, scroll to the bottom and set Response to the collection variable, click Hide advanced options?
  • Set Prompt to anything you want to ask ChatGPT, but use the txtOutput variable in the prompt.?
  • Example: “Summarize the following transcript into an after call work note:” + txtOutput
  • Use a Message Box (or similar activity) to display collection.Text, it contains the response from ChatGPT?

Summary

It seems like a lot (it's not)* but once you have this plumbing in place there are many AI plus automation opportunities that light up for the contact center. In this example we took an audio recording, transcribed it with OpenAI Whisper, summarized it with OpenAI ChatGPT, and can use UiPath to update the customer’s record. We might also ask ChatGPT to categorize calls and intents, and then use UiPath to store the data and begin to detect trends. We can use sentiment from ChatGPT to have UiPath update voice of the customer data or in performance reviews. At the start of a call UiPath might download and send recent bills to ChatGPT to summarize the differences for the agent. During the call, ChatGPT might provide insights on next best actions and offers, and UiPath will automate the action on the agent’s behalf.?We could go on and on (and I will try to in a future post)**.?

Finally, if you made it this far you are probably keen to learn more about AI and automation. Here is my plug for UiPath’s AI Summit. You will hear from top UiPath product experts and customers about the opportunity and business value of AI and automation. Register below to view the recordings.??

?https://www.uipath.com/events/ai-summit?


* I only spent 3 hours setting this up and that includes nearly an hour of troubleshooting a couple knucklehead mistakes I made in the configuration.

** I also know a visual of this solution in action will help and I will try to post a video later.

Craig Bannerman

Lead Architect at VKY Intelligent Automation

1 年

Had a go following your steps, so thank you for breaking it down. Took more of the CMD Prompt route (just what I know better) but was able to get Whisper installed and running locally. Thanks for the post Brad

回复
Ender VURAL

Siyad Sistem Ltd. ?ti. ?irketinde i? Geli?tirme Uzman?

1 年
回复
Ariana (Conti) Gula

AI-Powered Healthcare Automation

1 年

well done, Brad!

回复

要查看或添加评论,请登录

社区洞察

其他会员也浏览了