Use AI to Build Smarter Queries in Sitecore Search

Hey Sitecorians, hope you’re all doing great! New year and all: 2025 will be the year of AI!

I’m in love! It all started during Microsoft’s .NET Conf: Focus on AI, and I saw the light. The AI light!


Semantic Kernel, Semantic Kernel, Semantic Kernel

Semantic Kernel is a SAVIOR, at least for us .NET developers!

It will truly help you in your AI journey, a good friend!

Semantic Kernel is a lightweight, open-source development kit that lets you easily build AI agents and integrate the latest AI models into your C#, Python, or Java codebase. It serves as an efficient middleware that enables rapid delivery of enterprise-grade solutions. Semantic Kernel combines prompts with existing APIs to perform actions. By describing your existing code to AI models, they’ll be called to address requests. When a request is made the model calls a function, and Semantic Kernel is the middleware translating the model’s request to a function call and passes the results back to the model. By adding your existing code as a plugin, you’ll maximize your investment by flexibly integrating AI services through a set of out-of-the-box connectors. Semantic Kernel uses OpenAPI specifications (like Microsoft 365 Copilot) so you can share any extensions with other pro or low-code developers in your company.

From now on, Semantic Kernel is my best friend!

*Read and be amazed about the wonderful Semantic Kernel here.

So that got me thinking, what if we could use AI (and Semantic Kernel) with Sitecore Search? For those who have never heard of Sitecore Search, shame on you! Sitecore Search is perfect when you need to have external data in your Sitecore solution. It’s also excellent for crawling websites.

Anyway, let’s continue.

What if we could get AI to generate queries (Sitecore Search queries)? Rather than having the AI perform the searches, why not just have it build the queries?

Well friends, that’s what this post is all about.

So, the idea is this: Imagine a typical search, like browsing for cars on a website.

Normally, in Sitecore Search, you’d configure certain fields (attributes) like car make, model, or year for textual relevance. These help determine what appears in the search results, but manually setting these fields can be limiting.

Textual relevance refers to how strong a potential match is compared to the visitor’s search query. When you configure textual relevance, you tell Sitecore Search where in your content item it needs to look for matching terms and the relative importance it needs to give different content areas. For example, you might want Search to look for search terms in the values of the title attribute, the description attribute, or somewhere else. (Source: https://doc.sitecore.com/search/en/users/search-user-guide/configure-an-attribute-for-textual-relevance.html)

So what if we instead let AI create a query based on a search text, like:

I’m picky when it comes to cars. I want manufacturers that care about the environment, prioritize safety, and make cars accessible to everyone, including people with disabilities.

This makes search much cooler. It’s like having a conversation with the system instead of using old, boring search boxes we’ve always had.

The resulting Sitecore Search query could look something like this:

{
  "query": {
    "context": {
      "locale": {
        "country": "se",
        "language": "sv"
      }
    },
    "widget": {
      "items": [
        {
          "rfk_id": "42",
          "entity": "customentity",
          "sources": ["777777"],
          "search": {
            "filter": {
              "type": "and",
              "filters": [
                {
                  "type": "or",
                  "filters": [
                    { "type": "eq", "name": "make", "value": "Toyota" },
                    { "type": "eq", "name": "make", "value": "Tesla" },
                    { "type": "eq", "name": "make", "value": "Volvo" }
                  ]
                }
              ]
            },
            "sort": {
              "value": [
                { "name": "price", "order": "asc" }
              ]
            }
          }
        }
      ]
    }
  },
  "explanation": {
    "filters": [
      "make = Toyota, Tesla, Volvo - These manufacturers prioritize environmental care, safety, and accessibility."
    ],
    "sort": [
      "price ascending - Sort results by retail price in ascending order."
    ]
  }
}        
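If you consume this response in the Next.js frontend, it can help to pin down its shape in TypeScript. The sketch below is only inferred from the sample JSON in this post, not taken from the Sitecore Search documentation; the type names, the `"desc"` sort order, and the `makeFilter` helper are my own assumptions.

```typescript
// Types inferred from the sample responses in this post (an assumption, not an official schema).
type Filter =
  | { type: "and" | "or"; filters: Filter[] }
  | { type: "eq" | "lt" | "gte"; name: string; value: string | number }
  | { type: "in"; name: string; values: string[] };

interface SortEntry {
  name: string;
  order: "asc" | "desc"; // only "asc" appears in the samples; "desc" is assumed
}

interface GeneratedQuery {
  query: {
    context: { locale: { country: string; language: string } };
    widget: {
      items: Array<{
        rfk_id: string;
        entity: string;
        sources: string[];
        search: { content?: object; filter?: Filter; sort?: { value: SortEntry[] } };
      }>;
    };
  };
  explanation: { filters: string[]; sort: string[] };
}

// Hypothetical helper: builds the "or" group of make filters seen in the sample above.
function makeFilter(makes: string[]): Filter {
  return {
    type: "or",
    filters: makes.map((make): Filter => ({ type: "eq", name: "make", value: make })),
  };
}

console.log(JSON.stringify(makeFilter(["Toyota", "Tesla", "Volvo"]), null, 2));
```

Typing the response this way means a malformed query from the model is more likely to surface in your own code than as a confusing error from the search endpoint.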

How does it work?

The magic happens, of course, in Semantic Kernel.

Here’s the flow:

  1. The frontend (Next.js app) sends the user’s prompt to a REST API.
  2. The API uses Semantic Kernel to generate a Sitecore Search query.
  3. The query is returned to the frontend, which then performs the actual Sitecore Search.
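The three steps above can be sketched from the frontend's point of view. This is a minimal sketch under assumptions: the two URLs, the request body shape (`prompt`, `schemaType`), and sending the API key in an `Authorization` header are all placeholders, not details confirmed in this post. The fetch function is injected so the flow can be tried without a network.

```typescript
// Minimal fetch-like signature so the flow can be exercised with a fake.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<any> }>;

interface FlowConfig {
  generatorUrl: string; // hypothetical REST API endpoint that wraps Semantic Kernel
  searchUrl: string;    // your Sitecore Search endpoint (assumed to accept the query as a POST body)
  apiKey: string;       // placeholder; how the key is sent is an assumption
}

async function postJson(
  fetchFn: FetchLike,
  url: string,
  body: unknown,
  headers: Record<string, string>
): Promise<any> {
  const response = await fetchFn(url, {
    method: "POST",
    headers: { "Content-Type": "application/json", ...headers },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`Request to ${url} failed with status ${response.status}`);
  }
  return response.json();
}

async function searchWithPrompt(prompt: string, config: FlowConfig, fetchFn: FetchLike) {
  // Steps 1 and 2: send the prompt to the REST API, which returns { query, explanation }.
  const generated = await postJson(fetchFn, config.generatorUrl, { prompt, schemaType: "car" }, {});
  // Step 3: the frontend runs the generated query against Sitecore Search itself.
  const results = await postJson(fetchFn, config.searchUrl, generated.query, {
    Authorization: config.apiKey,
  });
  return { results, explanation: generated.explanation };
}
```

In a real app you would pass a thin wrapper around the browser's `fetch` as `fetchFn`. Keeping the explanation next to the results lets the UI show the user why each filter was applied.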

I will show you parts of the code, especially the Semantic Kernel part. In my example, I turn the “search query generator” into a plugin.

Creating the AI Plugin: It’s All About the Prompts

To make AI work with Sitecore Search, the real secret lies in the prompts. A well-designed prompt ensures that the AI understands the context, follows specific rules, and generates the required JSON query structure.

Here’s how I did this:

  1. System Prompt: Defines the rules and structure for the AI to follow.
  2. Schema: Stores the necessary data structure and constraints to generate meaningful queries.
  3. Plugin Logic: Brings it all together, dynamically generating queries based on user input.


1. The System Prompt

The system prompt ensures the AI knows what to do, how to respond, and the constraints it must follow. Here’s the code:

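private string GenerateSystemPrompt()
{
    // In an interpolated string ($@"..."), "{{" collapses to a single "{".
    // Template variables therefore need four braces so that Semantic Kernel
    // still receives {{$schema}} and {{$currentDate}}; the JSON braces below
    // correctly collapse to single braces.
    return $@"
    {{{{$schema}}}}
    Today's date is {{{{$currentDate}}}}.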
 
    You MUST return ONLY a JSON object with exactly two properties:
    {{
        ""query"": {{
            ""context"": {{
                ""locale"": {{
                    ""country"": ""se"",
                    ""language"": ""sv""
                }}
            }},
            ""widget"": {{
                ""items"": [
                    {{
                        ""rfk_id"": ""42"",
                        ""entity"": ""customentity"",
                        ""sources"": [""777777""],
                        ""search"": {{
                            // Your search configuration here
                        }}
                    }}
                ]
            }}
        }},
        ""explanation"": {{
            ""filters"": [
                // Array of filter explanations
            ],
            ""sort"": [
                // Array of sort explanations
            ]
        }}
    }}
 
    Rules:
    1. Convert any mentioned 'mil' to kilometers by multiplying by 10
    2. Return ONLY the JSON object, no additional text or explanations
    3. Do not include any markdown formatting or code blocks
    4. The response must be valid JSON
    5. Follow the exact structure shown above
 
    Example of a valid response:
    {{
        ""query"": {{
            // Your Sitecore Search query here
        }},
        ""explanation"": {{
            ""filters"": [
                ""price < 5000000 - Price must not exceed 5000000 SEK.""
            ],
            ""sort"": [
                ""price ascending - Sort the results by price in ascending order.""
            ]
        }}
    }}
 
    User request:
    {{{{$userPrompt}}}}
    ";
}        

As I said earlier, it’s all about the prompt. The more detailed you are, the better the response: everything from how to handle Swedish units like “mil” (1 mil = 10 km) to the JSON formatting rules.

This took a lot of work, and tons of hair pulling, to get it to return the right JSON.
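Even with strict rules in the prompt, a model can occasionally wrap its answer in a markdown code fence or drop a property. A defensive parser on the receiving side is cheap insurance. The sketch below is my own addition, not part of the plugin; `parseModelResponse` and `milToKm` are hypothetical helper names.

```typescript
// Rule 1 of the prompt: one Swedish "mil" is 10 kilometers.
function milToKm(mil: number): number {
  return mil * 10;
}

// Built at runtime so this example does not contain a literal fence marker.
const FENCE = "`".repeat(3);

// Strips an optional markdown fence (rule 3), parses the JSON (rule 4),
// and checks that exactly "query" and "explanation" are present (rule 5).
function parseModelResponse(raw: string): { query: object; explanation: object } {
  const openingFence = new RegExp("^" + FENCE + "(?:json)?\\s*", "i");
  const closingFence = new RegExp("\\s*" + FENCE + "$");
  const cleaned = raw.trim().replace(openingFence, "").replace(closingFence, "");

  const parsed = JSON.parse(cleaned); // throws a SyntaxError on invalid JSON
  const keys = Object.keys(parsed).sort().join(",");
  if (keys !== "explanation,query") {
    throw new Error('Expected exactly "query" and "explanation", got: ' + keys);
  }
  return parsed;
}

console.log(milToKm(2000)); // 2000 mil is 20000 km
```

Rejecting a bad response early lets the API retry or return a clear error instead of passing a broken query on to Sitecore Search.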

2. The Schema in Memory

The schema, stored in memory, provides the structure and constraints for the query. Here’s an example schema for car search:

make (string): Car brands (all car brands in the world),
color (string): Color of the car, in Swedish (Röd, Blå, etc.),
price (float): Price of the car (Swedish krona),
model (string): Car models (all car models in the world),
drive_type (string): Drive type in Swedish (Framhjulsdrift, etc.),
mileage_km (float): Mileage in kilometers (convert 'mil' to km by multiplying by 10),
fuel_type (string): Fuel type in Swedish (Bensin, Diesel, etc.),
year (string): Car model year,
gearbox (string): Gearbox in Swedish (Manuell, Automatisk, etc.),
body_type (string): Body type in Swedish (Sedan, Kombi, etc.),
...
 
Constraints:
- `rfk_id` is always '42',
- `entity` is always 'customentity',
- `locale` is always language 'sv' and country 'se',
- `sources` is always ["777777"].
 
Country-Specific Make Filters:
   - Sweden: Volvo, Saab
   - Germany: BMW, Audi, Mercedes
   - Japan: Toyota, Honda, Nissan
   - Italy: Ferrari, Lamborghini, Maserati
   - Korea: Hyundai, Kia, Genesis
   - France: Peugeot, Renault, Citroen
   - UK: Jaguar, Land Rover, Aston Martin
   - USA: Ford, Tesla, Chrysler
   - China: Geely, BYD, Great Wall        
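Because the model is free to invent attribute names, it can be worth checking the generated filter against the schema before running the search. Here is a minimal sketch, assuming the attribute list from the excerpt above. The real schema is longer (the excerpt ends with "..."), so attributes such as `condition` or `power_hp` that show up in the sample responses later in this post would be flagged unless you add them. The function names are hypothetical.

```typescript
// Loose view of a filter node: leaf filters have a name, groups have nested filters.
type FilterNode = { type: string; name?: string; filters?: FilterNode[] };

// Attributes from the schema excerpt above; extend this with the rest of your schema.
const KNOWN_ATTRIBUTES: string[] = [
  "make", "color", "price", "model", "drive_type",
  "mileage_km", "fuel_type", "year", "gearbox", "body_type",
];

// Walks the filter tree and collects every attribute name it references.
function collectAttributeNames(node: FilterNode): string[] {
  const names: string[] = node.name ? [node.name] : [];
  for (const child of node.filters || []) {
    names.push(...collectAttributeNames(child));
  }
  return names;
}

// Returns the distinct attribute names that are not in KNOWN_ATTRIBUTES.
function unknownAttributes(filter: FilterNode): string[] {
  const names = collectAttributeNames(filter);
  const unique = names.filter((name, index) => names.indexOf(name) === index);
  return unique.filter((name) => KNOWN_ATTRIBUTES.indexOf(name) === -1);
}
```

An empty result means every filter targets an attribute the schema knows about; anything else is a hint that the model drifted from the schema.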

Here’s how the schema is loaded into memory using the Semantic Kernel’s memory plugin:

var memoryStore = new VolatileMemoryStore();
var textMemory = new SemanticTextMemory(memoryStore, embeddingGenerator);
var memoryPlugin = new TextMemoryPlugin(textMemory);
 
var schemaManager = new SchemaManager(memoryPlugin);
await schemaManager.LoadSchemaFromFileAsync("carSchema", "carSchemaCollection", "Schemas/carSchema.txt");
 
services.AddSingleton<ISchemaManager>(schemaManager);        

3. The Plugin: Generating Queries

Here I created a search query generator plugin using Semantic Kernel. The plugin dynamically creates queries based on the user’s input, respecting all the rules and structures needed for Sitecore Search.

The plugin combines the system prompt and schema to generate the queries. Here’s part of the code, the interesting part:

[KernelFunction("search_cars")]
[Description("Search for cars based on the given query.")]
public async Task<object?> GenerateSitecoreQueryAsync(string userPrompt, string schemaKey, string collectionName)
{
    var lease = await RateLimiter.AcquireAsync(1);
    if (!lease.IsAcquired)
        return new { error = "Rate limit exceeded. Please try again later." };
 
    try
    {
        var cachedResponse = await _responseStore.GetCachedResponseAsync(userPrompt);
        if (!string.IsNullOrEmpty(cachedResponse))
            return DeserializeJson(cachedResponse);
 
        var schema = await LoadSchemaAsync(schemaKey, collectionName);
 
        if (string.IsNullOrWhiteSpace(schema))
            return new { error = "Schema not found or empty." };
 
        var systemPrompt = GenerateSystemPrompt();
        var queryPrompt = GenerateQueryPrompt(userPrompt);
 
        var queryFunction = _kernel.CreateFunctionFromPrompt(
            systemPrompt,
            functionName: "GenerateSitecoreQuery",
            description: "Generates a Sitecore Search query based on the given schema and user prompt."
        );
 
        var arguments = new KernelArguments
        {
            ["schema"] = schema,
            ["currentDate"] = DateTime.Now.ToString("yyyy-MM-dd"),
            ["userPrompt"] = queryPrompt
        };
 
        var result = await _kernel.InvokeAsync(queryFunction, arguments);
        var rawJsonString = result.GetValue<string>();
 
        if (string.IsNullOrWhiteSpace(rawJsonString))
            return new { error = "No result from chat completion." };
 
        await _responseStore.SaveResponseAsync(userPrompt, rawJsonString);
        return DeserializeJson(rawJsonString);
    }
    finally
    {
        lease.Dispose();
    }
}        
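The control flow of the plugin (check the cache, generate on a miss, store the result) does not depend on Semantic Kernel. If you also want a small cache on the frontend side, the same cache-aside pattern can be sketched in TypeScript. The names here are hypothetical, and normalizing the prompt before using it as a cache key is my own addition; the plugin above uses the raw prompt.

```typescript
// Wraps any async generator function with an in-memory cache keyed by the prompt.
function createCachedGenerator(generate: (prompt: string) => Promise<string>) {
  const cache = new Map<string, string>();

  return async function cachedGenerate(prompt: string): Promise<string> {
    // Assumption: prompts that differ only in case or outer whitespace share a result.
    const key = prompt.trim().toLowerCase();

    const hit = cache.get(key);
    if (hit !== undefined) {
      return hit; // cache hit: skip the model call entirely
    }

    const fresh = await generate(prompt);
    // Like the plugin, do not store empty results.
    if (fresh.trim() !== "") {
      cache.set(key, fresh);
    }
    return fresh;
  };
}
```

Every cache hit saves a model call, which matters for both latency and cost when many users type similar prompts.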

Calling the Plugin

Once the plugin is built and registered, it’s time to use it. In my REST API, I call the plugin from a handler to generate queries for Sitecore Search.

Here’s how the handler uses the plugin:

using FluentValidation;
using Microsoft.SemanticKernel;
 
public static class PromptSearchQueryGeneratorHandler
{
    public static async Task<IResult> Handle(
        Kernel kernel,
        PromptSearchQueryGeneratorRequest request,
        ILoggerFactory loggerFactory,
        IValidator<PromptSearchQueryGeneratorRequest> validator
    )
    {
        // Validate the request
        var validationResult = await validator.ValidateAsync(request);
        if (!validationResult.IsValid)
            return TypedResults.BadRequest(validationResult.Errors);
 
        var logger = loggerFactory.CreateLogger("PromptSearchQueryGeneratorHandler");
 
        // Get the SitecoreSearchGeneratorPlugin function.
        // GetFunction throws when the function is missing, so use TryGetFunction for a safe lookup.
        if (!kernel.Plugins.TryGetFunction("SitecoreSearchGeneratorPlugin", "search_cars", out var pluginFunction))
        {
            logger.LogError("SitecoreSearchGeneratorPlugin not found.");
            return TypedResults.BadRequest("Plugin not found.");
        }
 
        try
        {
            // Invoke the plugin function
            var result = await pluginFunction.InvokeAsync<string>(
                kernel,
                new KernelArguments
                {
                    // The key must match the plugin function's parameter name (userPrompt).
                    { "userPrompt", request.Prompt },
                    { "schemaKey", $"{request.SchemaType}Schema" },
                    { "collectionName", $"{request.SchemaType}SchemaCollection" }
                });
 
            return TypedResults.Ok(result);
        }
        catch (Exception ex)
        {
            logger.LogError(ex, "Error generating Sitecore Search query.");
            return TypedResults.BadRequest("Error generating query.");
        }
    }
}        

And that’s it!

The power of using AI is that you now get support for other languages: you can search in all kinds of languages.

Here are some more interesting responses from searches:

SWEDISH:

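Min pappa fyller snart 80. Han har dåligt mörkerseende och långsamma reflexer. Hjälp mig hitta en bil som passar honom

(English: My dad is turning 80 soon. He has poor night vision and slow reflexes. Help me find a car that suits him.)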

{
  "query": {
    "context": {
      "locale": {
        "country": "se",
        "language": "sv"
      }
    },
    "widget": {
      "items": [
        {
          "rfk_id": "42",
          "entity": "customentity",
          "sources": [
            "777777"
          ],
          "search": {
            "content": {},
            "filter": {
              "type": "and",
              "filters": [
                {
                  "type": "in",
                  "name": "drive_type",
                  "values": [
                    "Automatisk",
                    "Fyrhjulsdrift",
                    "Framhjulsdrift"
                  ]
                },
                {
                  "type": "eq",
                  "name": "condition",
                  "value": "Ny"
                }
              ]
            },
            "sort": {
              "value": [
                {
                  "name": "price",
                  "order": "asc"
                }
              ]
            }
          }
        }
      ]
    }
  },
  "explanation": {
    "filters": [
      "drive_type in ['Automatisk', 'Fyrhjulsdrift', 'Framhjulsdrift'] - Automatic and easy-to-drive cars are prioritized.",
      "condition = 'Ny' - Only new cars are considered to provide the best safety features."
    ],
    "sort": [
      "price ascending - Sort the results by retail price in ascending order."
    ]
  }
}        

GERMAN:

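Ich möchte ein Auto, das nicht mehr als 2000 mil gefahren ist, es muss rot sein und aus Deutschland kommen

(English: I want a car that has been driven no more than 2000 mil; it must be red and come from Germany.)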

{
  "query": {
    "context": {
      "locale": {
        "country": "se",
        "language": "sv"
      }
    },
    "widget": {
      "items": [
        {
          "rfk_id": "42",
          "entity": "customentity",
          "sources": [
            "777777"
          ],
          "search": {
            "content": {},
            "filter": {
              "type": "and",
              "filters": [
                {
                  "type": "eq",
                  "name": "color",
                  "value": "Röd"
                },
                {
                  "type": "lt",
                  "name": "mileage_km",
                  "value": 20000
                },
                {
                  "type": "or",
                  "filters": [
                    {
                      "type": "eq",
                      "name": "make",
                      "value": "BMW"
                    },
                    {
                      "type": "eq",
                      "name": "make",
                      "value": "Audi"
                    },
                    {
                      "type": "eq",
                      "name": "make",
                      "value": "Mercedes"
                    }
                  ]
                }
              ]
            },
            "sort": {
              "value": [
                {
                  "name": "price",
                  "order": "asc"
                }
              ]
            }
          }
        }
      ]
    }
  },
  "explanation": {
    "filters": [
      "color eq Röd - The car must be red.",
      "mileage_km lt 20000 - The car must not have more than 2000 mil (20000 km) mileage.",
      "make eq BMW, Audi, Mercedes - The car must be a German brand: BMW, Audi, or Mercedes."
    ],
    "sort": [
      "price ascending - Sort the results by retail price in ascending order."
    ]
  }
}        

KOREAN:

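[The original Korean prompt was lost to an encoding error. Judging by the explanation in the response below, it asked for a fast car from a Korean manufacturer that is at most 2 years old.]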

{
  "query": {
    "context": {
      "locale": {
        "country": "se",
        "language": "sv"
      }
    },
    "widget": {
      "items": [
        {
          "rfk_id": "42",
          "entity": "customentity",
          "sources": [
            "777777"
          ],
          "search": {
            "content": {},
            "filter": {
              "type": "and",
              "filters": [
                {
                  "type": "or",
                  "filters": [
                    {
                      "type": "eq",
                      "name": "make",
                      "value": "Hyundai"
                    },
                    {
                      "type": "eq",
                      "name": "make",
                      "value": "Kia"
                    },
                    {
                      "type": "eq",
                      "name": "make",
                      "value": "Genesis"
                    }
                  ]
                },
                {
                  "type": "gte",
                  "name": "power_hp",
                  "value": 100
                },
                {
                  "type": "gte",
                  "name": "year",
                  "value": "2023"
                }
              ]
            },
            "sort": {
              "value": [
                {
                  "name": "price",
                  "order": "asc"
                }
              ]
            }
          }
        }
      ]
    }
  },
  "explanation": {
    "filters": [
      "make: Hyundai, Kia, Genesis - Search for cars from Korean manufacturers.",
      "power_hp >= 100 - The car must be fast with at least some horsepower.",
      "year >= 2023 - The car must be 2 years old or newer."
    ],
    "sort": [
      "price ascending - Sort the results by price in ascending order."
    ]
  }
}        

This is just the tip of the iceberg of what you can do with AI! Why stop at text? Why not speech to text? There is so much more we can do here, with AI you can do EVERYTHING!

If you are interested in knowing more or have some ideas you want me to help you with, just ping me and let me help you on your AI journey.


That’s all for now, folks!

