Build a Personalized Chatbot Powered by Kumo AI Embeddings and Predictions
Follow along in my demo video.
Motivation
Imagine you are a shopper entering a store, looking for items to purchase. A personal assistant that can give you customized, real-time recommendations for the best items, given your past purchase history and constraints, can greatly improve your shopping experience.
The queries a user can ask an agent of this kind fall into two main categories:
An item-to-user query is the one we are all most familiar with: what items should I buy, given my personal purchase history? These queries are tailored to a specific customer and their preferences. An item-to-item query is not specific to any one customer; these are more generic queries about which items go well together or complement each other. A chatbot that can answer both types of queries significantly enhances the user experience.
Overview
There are several steps to building a chatbot like this, and this blog will go into depth on each of them.
1. Obtaining Kumo Predictions and Embeddings
Our chatbot can answer two types of queries: item-to-item and item-to-user. Let us look at each of these cases separately.
Item-to-User
For the item-to-user use case, we want to obtain the top-k item predictions for every customer. Kumo makes this very easy to obtain.
We use the publicly available H&M dataset. The original dataset consists of three tables: articles, customers, and transactions. These three tables serve as the tables we use to construct our Kumo graph. We link article_id and customer_id from the articles and customers tables, respectively, to their foreign keys in the transactions table.
With this graph, we can run the following PQuery :
PREDICT LIST_DISTINCT(transactions.article_id, 0, 30, days)
RANK TOP 25
FOR EACH customers.customer_id
This PQuery predicts the top 25 items for every customer ID over the next 30 days. After training a Kumo model with this PQuery, we can run a batch prediction workflow and obtain two main pieces of data: the top-25 item predictions for each customer, and the learned customer and item embeddings.
Item-to-Item
For the item-to-item use case, our setup looks a bit different. We use two copies of the articles table. Here, the nodes in the constructed graph are the items, and an edge between two nodes represents a co-occurrence relationship. Both articles tables link to a transactions table, which we can think of as a co-occurrence table.
We can run the following PQuery:
PREDICT LIST_DISTINCT(transactions.article_id2, 0, 30, days)
RANK TOP 10
FOR EACH articles.article_id
After training a Kumo model with this PQuery, we can run a batch prediction workflow and obtain the top-10 related-item predictions for each article, along with the corresponding item embeddings.
After running both batch prediction workflows, we now have the predictions and embeddings for both the item-to-user and item-to-item use cases.
2. Storing Data in Database
Now that we have the Kumo predictions and embeddings, our next step is to store the data in a database in a format that is easily searchable. I will be using Elasticsearch to store the data.
Elasticsearch is structured into indices, each with its own index mapping. For this application, there are a total of 8 indices in our database: item information, customer information, embedding indices (x4), and prediction indices (x2). Let us investigate each of these indices further.
Item Information Index (Index #1)
This is an index for storing basic information about every item, taken directly from the H&M dataset. Instead of storing each attribute as it appears in the original dataset, I combined the attributes into one field that is easily searchable. This way, users can query by item name or any other item attribute instead of having to specify the item ID.
For instance, consider the following item from the original dataset:
{"article_id": 108775015, "prod_name":"Strap top", "product_type_no":253, "product_type_name":"Vest top", "product_group_name": "Garment Upper body", "graphical_appearance_no":1010016, "graphical_appearance_name":"Solid", "colour_group_code":9, "colour_group_name":"Black", "perceived_colour_value_id":4, "perceived_colour_value_name":"Dark", "perceived_colour_master_id":5, "perceived_colour_master_name":"Black", "department_no":1676, "department_name":"Jersey Basic", "index_code":"A", "index_name":"Ladieswear", "index_group_no":1, "index_group_name":"Ladieswear", "section_no":16, "section_name":"Womens Everyday Basics", "garment_group_no":1002, "garment_group_name":"Jersey Basic"}
When storing this data in Elasticsearch, we condense the item into just two fields:
{"item_id": 108775015, "item_info": "Item 108775015 is called 'Strap top'. It is a Vest top (product type number 253) under the Garment Upper body product group. This item has a graphical appearance of Solid (appearance number 1010016) and belongs to the Black colour group (colour group code 9). It is perceived as Dark (perceived colour value ID 4) and categorized under the Black master colour (master colour ID 5). The item is in the Jersey Basic department (department number 1676) of Ladieswear (index code A). Specifically, it is in the Womens Everyday Basics section (section number 16). The garment group is Jersey Basic (garment group number 1002)."}
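The transformation from raw row to combined text field can be sketched as a small formatting helper. The template below covers only a few of the attributes for brevity, and the exact wording in the demo may differ:

```python
# Flatten an H&M article row into a single searchable text field.
# Field names match the H&M articles table; the sentence template
# is illustrative, not necessarily the demo's exact one.
def format_item_info(article: dict) -> str:
    return (
        f"Item {article['article_id']} is called '{article['prod_name']}'. "
        f"It is a {article['product_type_name']} (product type number "
        f"{article['product_type_no']}) under the "
        f"{article['product_group_name']} product group. It belongs to the "
        f"{article['colour_group_name']} colour group "
        f"(colour group code {article['colour_group_code']})."
    )

# Build the two-field document that gets indexed into Elasticsearch.
doc = {
    "item_id": 108775015,
    "item_info": format_item_info({
        "article_id": 108775015,
        "prod_name": "Strap top",
        "product_type_no": 253,
        "product_type_name": "Vest top",
        "product_group_name": "Garment Upper body",
        "colour_group_code": 9,
        "colour_group_name": "Black",
    }),
}
```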
Customer Information Index (Index #2)
This is an index for storing basic information about every customer, taken directly from the H&M dataset. As with the item information index, instead of storing every attribute as a separate field, I combined the attributes into a single customer_info field.
For instance, given the following customer data:
{"customer_id":"00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657","FN":null,"active":null,"club_member_status":"ACTIVE","fashion_news_frequency":"NONE","age":49,"postal_code":"52043ee2162cf5aa7ee79974281641c6f11a68d276429a91f8ca0d4b6efa8100"}
we store this in Elasticsearch with two columns:
{"customer_id": "00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657", "customer_info": "Customer with ID 00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657 is an ACTIVE member. They are 49.0 years old. Their fashion news frequency is set to 'NONE'. Their postal code is '52043ee2162cf5aa7ee79974281641c6f11a68d276429a91f8ca0d4b6efa8100'. Their active status is nan and FN status is nan."}
Storing the data in this format into Elasticsearch makes it easily searchable and allows for users to use other customer attributes when searching.
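Loading these documents into Elasticsearch can be sketched as follows. The helper only builds the bulk actions; the commented lines show how they could be sent with the official `elasticsearch` Python client. The index name and host are assumptions for illustration:

```python
# Build bulk-index actions for a list of formatted documents.
# id_field names the document key to use as the Elasticsearch _id.
def build_bulk_actions(docs, index_name, id_field):
    return [
        {"_index": index_name, "_id": doc[id_field], "_source": doc}
        for doc in docs
    ]

# With a running cluster (names are illustrative):
# from elasticsearch import Elasticsearch, helpers
# es = Elasticsearch("http://localhost:9200")
# helpers.bulk(es, build_bulk_actions(customer_docs, "customer_info", "customer_id"))
```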
Embedding Indices (Indices #3-6)
To store the embeddings for the item-to-user use case, we define two separate indices: one for the customer embeddings and another for the item embeddings.
Both indices have the same structure: an ID (customer or item), the embedding vector, and a text field describing the customer or item. The item-to-item embedding indices follow this same structure.
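A mapping for one of these embedding indices might look like the sketch below. The vector dimension (256 here) is an assumption; set it to the dimension of the embeddings Kumo exports. The same shape works for the customer embedding index with `customer_id` and `customer_info` fields:

```python
# Possible Elasticsearch mapping for an item embedding index:
# a keyword ID, a dense_vector for kNN search, and a text description.
item_embedding_mapping = {
    "mappings": {
        "properties": {
            "item_id": {"type": "keyword"},
            "embedding": {
                "type": "dense_vector",
                "dims": 256,          # assumed dimension
                "index": True,
                "similarity": "cosine",
            },
            "item_info": {"type": "text"},
        }
    }
}

# With a running cluster:
# es.indices.create(index="item_embeddings", body=item_embedding_mapping)
```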
Prediction Indices (Indices #7-8)
We store two prediction indices: item-to-user predictions for every customer, and item-to-item predictions for every item.
As with the item and customer information indices, the predictions for each item or customer are formatted into a single text field so that all predictions can be searched easily for any given item.
{"item_id": 108775015, "formatted_predictions": "The 1st prediction for item_id 108775015 is item_id 108775015 with score 317.7856140136719. Item 108775015 is called 'Strap top'... The 2nd prediction for item_id 108775015 is item_id 815456006 with score 238.77734375. Item 815456006 is called 'Madison Slim Stretch Chino'... The 10th prediction for item_id 108775015 is..."}
{"customer_id": "00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657", "formatted_predictions": "The 1st prediction for customer_id 00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657 is item_id 108775015 with score 317.7856140136719. Item 108775015 is called 'Strap top'... The 2nd prediction for customer_id 00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657 is item_id 815456006 with score 238.77734375. Item 815456006 is called 'Madison Slim Stretch Chino'... The 10th prediction for customer_id 00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657 is..."}
3. Context Retrieval
Once the data is structured in Elasticsearch, we have to figure out an appropriate way to retrieve context for a user query and produce a final response. The process breaks down into the steps below.
Classification Response (OpenAI)
The first step in the workflow is to classify the user query as either an item-to-item or an item-to-user prompt. This tells us which Elasticsearch indices to search for context relevant to the query.
This classification response is obtained by simply prompting GPT-4o with: "Please classify the following query as either an itemtoitem task or an itemtouser task. Please only respond with either itemtoitem or itemtouser, with no additional commentary."
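The classification step might be sketched like this. The prompt mirrors the one quoted above; the commented lines assume the `openai` Python client and the gpt-4o model:

```python
# System prompt for the classification step, mirroring the one in the post.
CLASSIFY_PROMPT = (
    "Please classify the following query as either an itemtoitem task or an "
    "itemtouser task. Please only respond with either itemtoitem or "
    "itemtouser, with no additional commentary."
)

# Build the chat messages for a classification request.
def build_classification_messages(user_query):
    return [
        {"role": "system", "content": CLASSIFY_PROMPT},
        {"role": "user", "content": user_query},
    ]

# With the openai client:
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(
#     model="gpt-4o",
#     messages=build_classification_messages(query),
# )
# task_type = resp.choices[0].message.content.strip()  # "itemtoitem" or "itemtouser"
```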
Context Retrieval (Elasticsearch)
Now that we have classified the query as either item-to-item or item-to-user, we have reached the most important part: retrieving the appropriate context. We have two main pieces of information stored in Elasticsearch: predictions (for both customers and items) and embeddings.
Let’s start with retrieving the appropriate predictions! Using the classification response from GPT, we can direct the query to the appropriate indices depending on the classification.
For the indices specified by the given task, we combine all relevant context. But given an index and a user query, how is the appropriate context retrieved?
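One plausible scheme, given that the predictions are stored as a single text field, is a full-text match query against formatted_predictions, keeping the top hits as context. This is an illustrative sketch, not necessarily the demo's exact query:

```python
# Build a full-text search over the formatted_predictions field,
# returning the top `size` hits as candidate context.
def build_prediction_query(user_query, size=3):
    return {
        "size": size,
        "query": {"match": {"formatted_predictions": {"query": user_query}}},
    }

# With a running cluster (index name is illustrative):
# hits = es.search(index="customer_predictions",
#                  body=build_prediction_query(query))["hits"]["hits"]
# context = [h["_source"]["formatted_predictions"] for h in hits]
```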
Context Retrieval (Embeddings)
Now, let us think about a query like “I just bought the Mariette Blazer. Recommend me a white item to buy” or “I am customer x. Recommend me a sweater I can buy based on my purchase history”.
These queries can be powered by Kumo predictions. We search for the predictions for the specific item or customer, and then search for the specific attribute within those predictions (e.g., white items in the predictions for the Mariette Blazer, or sweaters in customer x's predictions).
But what if there isn't a white item in the predictions for the Mariette Blazer, or a sweater in customer x's predictions? This is where the embeddings come in. Given a specific attribute or filter (e.g., white, sweater), instead of searching for the attribute within the pre-computed predictions, we can use the embeddings to do a filtered search.
The workflow for retrieving embeddings can be summarized into four major steps:
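A filtered embedding search along these lines could use Elasticsearch's kNN API: look up the anchor item's embedding vector, then run a kNN query over the item embedding index with a text filter on the attribute. The field and index names follow the mapping assumed earlier in this post:

```python
# Build a kNN search over the embedding field, restricted to documents
# whose item_info text matches the attribute filter (e.g. "white").
def build_filtered_knn_query(query_vector, attribute, k=10):
    return {
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": 100,
            "filter": {"match": {"item_info": attribute}},
        }
    }

# With a running cluster:
# body = build_filtered_knn_query(blazer_embedding, "white")
# hits = es.search(index="item_embeddings", body=body)["hits"]["hits"]
```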
4. Retrieval Augmented Generation (RAG)
So far, we have classified the user query and retrieved the relevant context from Elasticsearch.
Now that we have the relevant context, how can we use this to obtain a coherent response to display to the user? This is where RAG comes in.
The RAG approach can be broken down into three simple steps:
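The generation step can be sketched as follows: join the retrieved context chunks, place them in the system prompt, and ask the model to answer from that context only. The prompt wording is illustrative, not the demo's exact prompt:

```python
# Assemble chat messages that ground the model's answer in the
# retrieved context chunks.
def build_rag_messages(user_query, context_chunks):
    context = "\n\n".join(context_chunks)
    system = (
        "You are a shopping assistant. Answer the user's question using only "
        "the context below.\n\nContext:\n" + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_query},
    ]

# With the openai client:
# resp = client.chat.completions.create(
#     model="gpt-4o",
#     messages=build_rag_messages(query, context),
# )
# answer = resp.choices[0].message.content
```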
5. User Experience
Now that we have our final response, we want to display it to the user and provide an easy way to interact with our agent. Streamlit provides a simple, elegant UI for users to interact with and get responses to their queries.
6. Evaluation
To test the accuracy of the responses returned, I created a test dataset of 50 queries: 25 item-to-item and 25 item-to-user, varying in length and style.
Overall, the agent had 90% accuracy (24/25 for item-to-user and 21/25 for item-to-item)!
On average:
This chatbot is just one of the many cool things that can be built using Kumo AI!
Next step: request a free trial to give it a shot!