Building a RAG System with Embeddings, Delphi and Firebird database

Building a RAG System with Embeddings, Delphi and Firebird database

Artificial intelligence (AI) has been around in the world of computing for a long time, but it has only recently exploded, mainly through the large language models (LLM) that have produced the magic of being able to speak to computers in the same way that we do with people, and also that these devices that work with 0 and 1, answer us in a way that we understand.

Recent development is such, that it is difficult to keep up with the pace of new models and techniques to obtain better and more precise answers.

In this article I show my tests to better understand how they work and to be able to build programs that take advantage of their possibilities. I will describe the process of creating a (RAG) system using embeddings generated by OpenAI, utilizing Delphi as programming tool and Firebird to store and manage the data

1. Understanding the problem

Large models are trained with huge amounts of data, which allows them to give us an answer to almost anything we ask them. Because of the way they construct that answer, it may not always be as precise as we are looking for. To improve the quality of the answers, and since AI often does not know about data from our businesses, we can provide context to our prompt that helps return the correct answer. This technique is known as Retrieval-Augmented Generation (RAG), and I will describe its steps in this article.

2. Embeddings

AI is based on neural networks, made up of a multitude of layers, each layer with its nodes, where a series of parameters are adjusted that use mathematical operations that transform a series of inputs into the desired output. An embedding in AI is a numerical representation of data (like text, images, or audio) in a high-dimensional vector space where similar items are positioned close to each other. Think of it as converting information into a list of numbers that capture the meaning or features of that data.

The embeddings alow us convert unstructured data into structured numerical format, enable machines to understand semantic relationships and allow mathematical operations among them.

Embedding stores are called vector databases, are specialized databases optimized for storing/querying high-dimensional vectors, vector indexing and optimized for similarity search (nearest neighbor search).

In this case I use the Firebird database, which is not a vector database, but has the necessary functionalities to test and better understand these systems. Firebird is a relational database that I have been using for many years, it is an open source project derived from Interbase, it requires little maintenance, it takes up little disk space, and it is actively maintained and improved.

3. Get embeddings from our documents

To obtain the embeddings of our documents we will query the OpenAI text-embedding-ada-002 model, which returns vectors with a fixed length of 1536 double values.

To consult the Open AI API, we need to have an account and have an API-key that we must send in the requests we make to it.

I query OpenAI in the Delphi IDE, which uses Object Pascal as a language. It is an environment that has been in the sector for many years and is continually updated with improvements in the language and the available tools. It compiles programs for Windows, iOS, MacOS, Android and Linux systems. It has an excellent set of components ready to use in your applications and excellent support for databases.

I use the Delphi VCL project application and the Firedac components for database operations. The project use clause requires this units:

uses
  Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes, System.Generics.Collections, System.Generics.Defaults, System.Math, Vcl.Graphics,
  Vcl.Controls, Vcl.Forms, Vcl.Dialogs, FireDAC.Stan.Intf, FireDAC.Stan.Option,
  FireDAC.Stan.Error, FireDAC.UI.Intf, FireDAC.Phys.Intf, FireDAC.Stan.Def,
  FireDAC.Stan.Pool, FireDAC.Stan.Async, FireDAC.Phys, FireDAC.VCLUI.Wait,
  FireDAC.Phys.FBDef, FireDAC.Phys.IBBase, FireDAC.Phys.FB, Data.DB,
  FireDAC.Comp.Client, FireDAC.Stan.Param, System.JSON,
  System.Net.URLClient, System.Net.HttpClient, System.Net.HttpClientComponent,
  Vcl.StdCtrls, FireDAC.VCLUI.Async, FireDAC.Comp.UI, FireDac.DApt,
  Vcl.Samples.Spin;        

To obtain the array of values that form the embedding of our text, we make a request of this type to end point https://api.openai.com/v1/embeddings OpenAI API:

function GetEmbedding(const Text: string): TArray<Double>;
var
  HttpClient: TNetHTTPClient;
  RequestBody: TStringStream;
  Response: IHTTPResponse;
  JSONObj: TJSONObject;
  EmbeddingData: TJSONArray;
  I: Integer;
begin
  HttpClient := TNetHTTPClient.Create(nil);
  try
    HttpClient.ContentType := 'application/json';
    HttpClient.CustomHeaders['Authorization'] := 'Bearer ' + ApiKey;

    RequestBody := TStringStream.Create(
      TJSONObject.Create
        .AddPair('model', 'text-embedding-ada-002')
        .AddPair('input', Text)
        .ToString,
      TEncoding.UTF8
    );

    try
      // Hacer la petición POST
      Response := HttpClient.Post('https://api.openai.com/v1/embeddings', RequestBody);

      if Response.StatusCode = 200 then
      begin
        JSONObj := TJSONObject.ParseJSONValue(Response.ContentAsString) as TJSONObject;
        try
          EmbeddingData := JSONObj.GetValue<TJSONArray>('data').Items[0].GetValue<TJSONArray>('embedding');
          SetLength(Result, EmbeddingData.Count);
          for I := 0 to EmbeddingData.Count - 1 do
            Result[I] := EmbeddingData.Items[I].AsType<Double>;
        finally
          JSONObj.Free;
        end;
      end else
      begin
        raise Exception.CreateFmt('Error getting the embedding. Status code: %d - %s', [Response.StatusCode, Response.StatusText]);
      end;
    finally
      RequestBody.Free;
    end;
  finally
    HttpClient.Free;
  end;
end;        

4. Save the embeddings in the Firebird database

The table I use to store the embeddings in Firebird has the following structure. I store the vector in an array-type field (I had read about its existence, but had never used it), to which I assign a size of 1536, which is the size of the vector returned by the model that generates the embeddings.

CREATE TABLE EMBEDDINGS (
    ID_EMBED  INTEGER GENERATED BY DEFAULT AS IDENTITY,
    TEXT      BLOB SUB_TYPE TEXT SEGMENT SIZE 400 ,
    VECTOREM  DOUBLE PRECISION [1:1536];
);

/******************************************************************************/
/*                          Autoincrement generators                          */
/******************************************************************************/

ALTER TABLE EMBEDDINGS ALTER ID_EMBED RESTART WITH 3;

/******************************************************************************/
/*                                Primary keys                                */
/******************************************************************************/

ALTER TABLE EMBEDDINGS ADD PRIMARY KEY (ID_EMBED);        

In order to save the array field in the embeddings table we use this code:

const
  Vector_Size = 1536;

procedure StoreEmbeddingInFirebirdArray(Connection: TFDConnection; const Text: string; const Embedding: TArray<Double>);
var
  Query: TFDQuery;
  I: Integer;
begin
  if Length(Embedding) <> Vector_Size then
    raise Exception.Create('Vector size is not correct. Must be 1536.');

  Query := TFDQuery.Create(nil);
  try
    Query.Connection := Connection;
    Query.SQL.Text := 'INSERT INTO EMBEDDINGS (TEXT, VECTOREM) VALUES (:text, :embedding)';
    Query.Params.ParamByName('text').AsString := Text;
    // TFDParam.DataTypeName must be set to
    // <Table name>.<Field name>
    Query.Params[1].DataTypeName := 'EMBEDDINGS.VECTOREM';
    // TFDParam.ArrayType must be set to atTable,
    // not to atArray
    Query.Params[1].ArrayType := atTable;
    // TFDParam.ArraySize must be set to array size
    Query.Params[1].ArraySize := Vector_Size;
    // Fill the array with the values
    for I := Low(Embedding) to High(Embedding) do
    begin
     Query.Params[1].AsFloats[I]:= Embedding[I];
    end;
    Query.ExecSQL;
  finally
    Query.Free;
  end;
end;        

5. Query the embedding closest to another

Objective: To find the most relevant existing context by comparing stored embeddings with the embedding of the user's query.

The first step is to find the most similar context available in our database by comparing the embeddings of stored texts with the embedding of the user query. We use the cosine similarity metric to determine which of the embeddings are most similar to the query.

To do this, the users query embedding is generated using OpenAI's API, and then a comparison is made between this embedding and the stored embeddings using a cosine similarity calculation. The result is a reference to the text with the closest matching context. The cosine method measures the angle between two vectors, it's a good choice to check text similarity and document comparison, there are other methods like euclidean distance, manhattan distance, etc.

//Mathematic operation to calculate similarity between vectors
function CosineSimilarity(const A, B: TArray<Double>): Double;
var
  DotProduct, MagnitudeA, MagnitudeB: Double;
  I: Integer;
begin
  DotProduct := 0;
  MagnitudeA := 0;
  MagnitudeB := 0;

  for I := Low(A) to High(A) do
  begin
    DotProduct := DotProduct + A[I] * B[I];
    MagnitudeA := MagnitudeA + A[I] * A[I];
    MagnitudeB := MagnitudeB + B[I] * B[I];
  end;

  if (MagnitudeA = 0) or (MagnitudeB = 0) then
    Result := 0
  else
    Result := DotProduct / (Sqrt(MagnitudeA) * Sqrt(MagnitudeB));
end;        

In Delphi, we achieve the querying to the Firebird database to retrieve all stored embeddings. The cosine similarity is then calculated programmatically in Delphi by iterating through the embeddings and finding the one with the highest similarity score.

First we retrive from OpenAPI the embedding from the user prompt, after it we find the most similar embedding form the table:

//Function to retrieve all the embeddings from database
function GetAllEmbeddingsFromDatabase(Connection: TFDConnection): TArray<TArray<Double>>;
var
  Query: TFDQuery;
  Embeddings: TArray<TArray<Double>>;
  EmbeddingArray: TArray<Double>;
  I, Count: Integer;
  F: TArrayField;
begin
  Query := TFDQuery.Create(nil);
  try
    Query.Connection := Connection;
    Query.SQL.Text := 'SELECT ID_EMBED, VECTOREM FROM EMBEDDINGS';
    Query.Open;
    while not Query.Eof do
    begin
     SetLength(EmbeddingArray, Vector_Size);
     F:= TArrayField(Query.FieldByName('VECTOREM'));
     for I := 0 to Vector_Size - 1 do
      begin
       EmbeddingArray[I]:= F.FieldValues[I];
      end;
      Embeddings := Embeddings + [EmbeddingArray];
      Query.Next;
    end;
    Result := Embeddings;
  finally
    Query.Free;
  end;
end;

function FindMostSimilarEmbedding(Connection: TFDConnection; const QueryEmbedding: TArray<Double>): Integer;
var
  AllEmbeddings: TArray<TArray<Double>>;
  MaxSimilarity, Similarity: Double;
  MostSimilarID, I: Integer;
  Query: TFDQuery;
begin
  // Get all embeddings from database
  AllEmbeddings := GetAllEmbeddingsFromDatabase(Connection);

  MaxSimilarity := -1;
  MostSimilarID := -1;

  Query := TFDQuery.Create(nil);
  try
    Query.Connection := Connection;
    Query.SQL.Text := 'SELECT ID_EMBED FROM EMBEDDINGS';
    Query.Open;
    I := 0;

    // Calcular la similitud coseno con cada embedding almacenado
    while not Query.Eof do
    begin
      Similarity := CosineSimilarity(QueryEmbedding, AllEmbeddings[I]);
      if Similarity > MaxSimilarity then
      begin
        MaxSimilarity := Similarity;
        MostSimilarID := Query.FieldByName('ID_EMBED').AsInteger;
      end;

      Query.Next;
      Inc(I);
    end;

    Result := MostSimilarID;

  finally
    Query.Free;
  end;
end;        

6. Send the request to the OpenAPI with prompt + context

Once the most relevant context has been identified, it is used to enhance the prompt that will be sent to the OpenAI API. The goal is to provide the model with additional background information to improve the accuracy and relevance of the generated response. The prompt is built by combining the retrieved context along with the user's query. This comprehensive prompt is then sent to OpenAI's API, specifically using the gpt-3.5-turbo model (or another suitable model), which will generate a response based on both the context and the query. This approach ensures that the generated answer is as precise and useful as possible.

In Delphi, we use the TNetHTTPClient component to make the HTTP request to OpenAI's API. Here is a sample Delphi function to send the request:

function ExecuteQueryWithContext(Connection: TFDConnection; const UserQuery: string): string;
var
  QueryEmbedding: TArray<Double>;
  MostRelevantContext, FinalPrompt: string;
begin
  QueryEmbedding := GetEmbedding(UserQuery);

  // Find the nearest context text
  MostRelevantContext := FindMostSimilarContext(Connection, QueryEmbedding);

  // Make the final prompt
  FinalPrompt := 'relevant context: ' + MostRelevantContext + sLineBreak +
                 'user prompt: ' + UserQuery + sLineBreak +
                 'Respuesta:';

  //Get answer from OpenAI
  result := GetOpenAIResponse(FinalPrompt);
end;        
function GetOpenAIResponse(const Prompt: string): string;
var
  HttpClient: TNetHTTPClient;
  RequestBody: TStringStream;
  Response: IHTTPResponse;
  JSONObj: TJSONObject;
  ChoicesArray: TJSONArray;
  CompletionText: string;
begin
  HttpClient := TNetHTTPClient.Create(nil);
  try
    HttpClient.ContentType := 'application/json';
    HttpClient.CustomHeaders['Authorization'] := 'Bearer ' + ApiKey;

    RequestBody := TStringStream.Create(
      TJSONObject.Create
        .AddPair('model', 'gpt-3.5-turbo')
        .AddPair('messages', TJSONArray.Create(
          TJSONObject.Create
            .AddPair('role', 'user')
            .AddPair('content', Prompt)
        ))
        .AddPair('max_tokens', 150)
        .AddPair('temperature', 0.7)
        .ToString,
      TEncoding.UTF8
    );

    try
      Response := HttpClient.Post('https://api.openai.com/v1/chat/completions', RequestBody);
      if Response.StatusCode = 200 then
      begin
        // Parsear la respuesta JSON
        JSONObj := TJSONObject.ParseJSONValue(Response.ContentAsString) as TJSONObject;
        try
          ChoicesArray := JSONObj.GetValue<TJSONArray>('choices');
          if ChoicesArray.Count > 0 then
            CompletionText := ChoicesArray.Items[0].GetValue<TJSONObject>('message').GetValue<string>('content')
          else
            CompletionText := '';
          Result := CompletionText;
        finally
          JSONObj.Free;
        end;
      end
      else
      begin
        raise Exception.CreateFmt('Error getting the answer. State code: %d - %s', [Response.StatusCode, Response.StatusText]);
      end;
    finally
      RequestBody.Free;
    end;
  finally
    HttpClient.Free;
  end;
end;        

Conclusion

The described process effectively demonstrates how to create a Retrieval-Augmented Generation (RAG) system by leveraging stored embeddings saved in a Firebird database. By retrieving the most relevant context and combining it with the user query, we can generate high-quality responses that are informed by previous knowledge.

This approach combines the power of retrieval-based methods with generative models, resulting in a system that not only provides relevant information but also creates contextually appropriate answers to user queries.

This article has been written as a first contact to understand how these systems work and how information is exchanged. There is a lot of room for improvement, such as indexing embedding tables with search words, segmenting the table by groups or clusters, using other embedding query methods, etc.

My first contact with AI from Delphi code was using the components Maker AI that I recomend to visit and use.



Gregory Koehn

Software Developer at Orion Law Management Systems, Inc

4 个月

Interesting article! I like your description of Delphi... "Delphi IDE, which uses Object Pascal as a language. It is an environment that has been in the sector for many years and is continually updated with improvements in the language and the available tools. It compiles programs for Windows, iOS, MacOS, Android and Linux systems. It has an excellent set of components ready to use in your applications and excellent support for databases."

要查看或添加评论,请登录

Alberto Fornés的更多文章

  • Perform data analysis with Delphi

    Perform data analysis with Delphi

    Delphi (Object Pascal) is a great programming language, and for many a great unknown or forgotten. This week it will be…

    3 条评论
  • Etiquetado alimentario: claves para entender algunos símbolos

    Etiquetado alimentario: claves para entender algunos símbolos

    El etiquetado alimentario cumple un rol fundamental en la orientación del consumidor a la hora de elegir qué productos…

  • Símbolos de Reciclaje: La Clave para una Gestión Sostenible de los Residuos

    Símbolos de Reciclaje: La Clave para una Gestión Sostenible de los Residuos

    A partir del 1 de enero de 2025 entrará en vigor la obligatoriedad de que los envases domésticos en Espa?a indiquen en…

  • The Role of Symbols in Product Labeling

    The Role of Symbols in Product Labeling

    In today's global marketplace, effective communication through product labeling is more important than ever. Symbols…

    4 条评论
  • Use PLABEL VCL as a report editor

    Use PLABEL VCL as a report editor

    Although the use of the PLABEL VCL 3.0 editor linked to data, is to design and print labels, it can also be used as an…

    1 条评论
  • [TIP] Save Delphi program options to an integer

    [TIP] Save Delphi program options to an integer

    It is a common requirement in programs to have to save several program options, or action permissions, to adapt it to…

  • New version PLABEL VCL 3.0?

    New version PLABEL VCL 3.0?

    Version 3.0 of the PLABEL VCL components has been released, this is a version written from scratch and using the Skia…

  • PLABEL VCL 3.0

    PLABEL VCL 3.0

    Nueva versión de los componentes Delphi para incluir un editor de etiquetas en sus programas, se ha a?adido un editor…

社区洞察

其他会员也浏览了