Building a RAG System with Embeddings, Delphi and Firebird database
Artificial intelligence (AI) has been around in the world of computing for a long time, but it has only recently exploded, mainly thanks to large language models (LLMs), which have produced the magic of letting us talk to computers the same way we talk to people, and of having these machines that work with 0s and 1s answer us in a way we understand.
Development is now so fast that it is difficult to keep up with the pace of new models and techniques for obtaining better and more precise answers.
In this article I present my experiments, carried out to better understand how these systems work and to be able to build programs that take advantage of their possibilities. I will describe the process of creating a Retrieval-Augmented Generation (RAG) system using embeddings generated by OpenAI, with Delphi as the programming tool and Firebird to store and manage the data.
1. Understanding the problem
Large models are trained on huge amounts of data, which allows them to give us an answer to almost anything we ask. Because of the way they construct that answer, however, it may not always be as precise as we are looking for. To improve the quality of the answers, and since the model usually knows nothing about our own business data, we can add context to our prompt that helps it return the correct answer. This technique is known as Retrieval-Augmented Generation (RAG), and I will describe its steps in this article.
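The whole flow can be sketched in a few lines using the functions that are developed step by step in the rest of this article. Note that GetTextByID is a hypothetical helper, not shown here, that would read the stored text for a given ID_EMBED:

```pascal
// High-level sketch of the RAG pipeline described in this article.
// GetEmbedding, FindMostSimilarEmbedding and GetOpenAIResponse are
// implemented in the following sections; GetTextByID is a hypothetical
// helper that reads the TEXT column for the given ID_EMBED.
function AnswerWithRAG(Connection: TFDConnection; const UserQuery: string): string;
var
  QueryEmbedding: TArray<Double>;
  ContextID: Integer;
begin
  // 1. Convert the user question into an embedding vector
  QueryEmbedding := GetEmbedding(UserQuery);
  // 2. Find the stored text whose embedding is closest to the question
  ContextID := FindMostSimilarEmbedding(Connection, QueryEmbedding);
  // 3. Build an augmented prompt and ask the chat model
  Result := GetOpenAIResponse('Context: ' + GetTextByID(Connection, ContextID) +
    sLineBreak + 'Question: ' + UserQuery);
end;
```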
2. Embeddings
AI is based on neural networks made up of many layers, each layer with its nodes, where a series of parameters is adjusted through mathematical operations that transform the inputs into the desired output. An embedding is a numerical representation of data (text, images, audio...) in a high-dimensional vector space where similar items are positioned close to each other. Think of it as converting information into a list of numbers that captures the meaning or features of that data.
Embeddings allow us to convert unstructured data into a structured numerical format, enable machines to understand semantic relationships, and make it possible to apply mathematical operations to them.
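As a toy illustration (the three-dimensional vectors below are invented; real embeddings have hundreds or thousands of dimensions), two texts with similar meaning end up with vectors that point in almost the same direction:

```pascal
var
  VDog, VPuppy, VCar: TArray<Double>;
begin
  // Made-up 3-dimensional "embeddings" just to show the idea
  VDog   := [0.9, 0.8, 0.1];
  VPuppy := [0.8, 0.9, 0.2];  // close in meaning to "dog"
  VCar   := [0.1, 0.2, 0.9];  // unrelated meaning
  // Using the CosineSimilarity function shown later in this article,
  // CosineSimilarity(VDog, VPuppy) is close to 1,
  // while CosineSimilarity(VDog, VCar) is much lower.
end;
```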
Stores of embeddings are called vector databases: specialized databases optimized for storing and querying high-dimensional vectors, with vector indexing and fast similarity search (nearest-neighbor search).
In this case I use the Firebird database. It is not a vector database, but it has the functionality needed to test and better understand these systems. Firebird is a relational database that I have been using for many years; it is an open source project derived from InterBase, requires little maintenance, takes up little disk space, and is actively maintained and improved.
3. Get embeddings from our documents
To obtain the embeddings of our documents we will query the OpenAI text-embedding-ada-002 model, which returns vectors with a fixed length of 1536 double values.
To use the OpenAI API, we need an account and an API key that we must send with every request.
I query OpenAI from the Delphi IDE, which uses Object Pascal as its language. It is an environment that has been in the industry for many years and is continually updated with improvements to the language and the available tools. It compiles programs for Windows, iOS, macOS, Android and Linux, has an excellent set of components ready to use in your applications, and offers excellent database support.
I use a Delphi VCL application and the FireDAC components for database operations. The project's uses clause requires these units:
uses
Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes, System.Generics.Collections, System.Generics.Defaults, System.Math, Vcl.Graphics,
Vcl.Controls, Vcl.Forms, Vcl.Dialogs, FireDAC.Stan.Intf, FireDAC.Stan.Option,
FireDAC.Stan.Error, FireDAC.UI.Intf, FireDAC.Phys.Intf, FireDAC.Stan.Def,
FireDAC.Stan.Pool, FireDAC.Stan.Async, FireDAC.Phys, FireDAC.VCLUI.Wait,
FireDAC.Phys.FBDef, FireDAC.Phys.IBBase, FireDAC.Phys.FB, Data.DB,
FireDAC.Comp.Client, FireDAC.Stan.Param, System.JSON,
System.Net.URLClient, System.Net.HttpClient, System.Net.HttpClientComponent,
Vcl.StdCtrls, FireDAC.VCLUI.Async, FireDAC.Comp.UI, FireDAC.DApt,
Vcl.Samples.Spin;
To obtain the array of values that form the embedding of our text, we make a POST request like the following to the OpenAI API endpoint https://api.openai.com/v1/embeddings:
function GetEmbedding(const Text: string): TArray<Double>;
var
HttpClient: TNetHTTPClient;
RequestJSON: TJSONObject;
RequestBody: TStringStream;
Response: IHTTPResponse;
JSONObj: TJSONObject;
EmbeddingData: TJSONArray;
I: Integer;
begin
HttpClient := TNetHTTPClient.Create(nil);
try
HttpClient.ContentType := 'application/json';
HttpClient.CustomHeaders['Authorization'] := 'Bearer ' + ApiKey;
// Build the request body; the temporary JSON object is freed to avoid a memory leak
RequestJSON := TJSONObject.Create
.AddPair('model', 'text-embedding-ada-002')
.AddPair('input', Text);
try
RequestBody := TStringStream.Create(RequestJSON.ToString, TEncoding.UTF8);
finally
RequestJSON.Free;
end;
try
// Make the POST request
Response := HttpClient.Post('https://api.openai.com/v1/embeddings', RequestBody);
if Response.StatusCode = 200 then
begin
JSONObj := TJSONObject.ParseJSONValue(Response.ContentAsString) as TJSONObject;
try
EmbeddingData := JSONObj.GetValue<TJSONArray>('data').Items[0].GetValue<TJSONArray>('embedding');
SetLength(Result, EmbeddingData.Count);
for I := 0 to EmbeddingData.Count - 1 do
Result[I] := EmbeddingData.Items[I].AsType<Double>;
finally
JSONObj.Free;
end;
end else
begin
raise Exception.CreateFmt('Error getting the embedding. Status code: %d - %s', [Response.StatusCode, Response.StatusText]);
end;
finally
RequestBody.Free;
end;
finally
HttpClient.Free;
end;
end;
4. Save the embeddings in the Firebird database
The table I use to store the embeddings in Firebird has the following structure. I store the vector in an array-type field (I had read about its existence, but had never used it), to which I assign a size of 1536, which is the size of the vector returned by the model that generates the embeddings.
CREATE TABLE EMBEDDINGS (
ID_EMBED INTEGER GENERATED BY DEFAULT AS IDENTITY,
TEXT BLOB SUB_TYPE TEXT SEGMENT SIZE 400,
VECTOREM DOUBLE PRECISION [1:1536]
);
ALTER TABLE EMBEDDINGS ADD PRIMARY KEY (ID_EMBED);
In order to save the array field in the embeddings table we use this code:
const
Vector_Size = 1536;
procedure StoreEmbeddingInFirebirdArray(Connection: TFDConnection; const Text: string; const Embedding: TArray<Double>);
var
Query: TFDQuery;
I: Integer;
begin
if Length(Embedding) <> Vector_Size then
raise Exception.Create('Vector size is not correct. Must be 1536.');
Query := TFDQuery.Create(nil);
try
Query.Connection := Connection;
Query.SQL.Text := 'INSERT INTO EMBEDDINGS (TEXT, VECTOREM) VALUES (:text, :embedding)';
Query.Params.ParamByName('text').AsString := Text;
// TFDParam.DataTypeName must be set to
// <Table name>.<Field name>
Query.Params[1].DataTypeName := 'EMBEDDINGS.VECTOREM';
// TFDParam.ArrayType must be set to atTable,
// not to atArray
Query.Params[1].ArrayType := atTable;
// TFDParam.ArraySize must be set to array size
Query.Params[1].ArraySize := Vector_Size;
// Fill the array with the values
for I := Low(Embedding) to High(Embedding) do
begin
Query.Params[1].AsFloats[I]:= Embedding[I];
end;
Query.ExecSQL;
finally
Query.Free;
end;
end;
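Putting the two previous pieces together, indexing a document is just a matter of requesting its embedding and storing it. This is a sketch, using the GetEmbedding function from section 3:

```pascal
// Index one document: get its embedding from OpenAI and persist it
procedure IndexDocument(Connection: TFDConnection; const DocText: string);
var
  Embedding: TArray<Double>;
begin
  // 1536-value vector returned by text-embedding-ada-002
  Embedding := GetEmbedding(DocText);
  // Save text + vector in the EMBEDDINGS table
  StoreEmbeddingInFirebirdArray(Connection, DocText, Embedding);
end;
```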
5. Query the embedding closest to another
Objective: to find the most relevant existing context by comparing the stored embeddings with the embedding of the user's query.
The first step is to find the most similar context available in our database. To do this, the embedding of the user's query is generated using OpenAI's API and then compared against each stored embedding; the one with the highest cosine similarity identifies the text with the closest matching context. Cosine similarity measures the angle between two vectors, which makes it a good choice for text similarity and document comparison; other metrics exist as well, such as Euclidean distance or Manhattan distance.
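Formally, the cosine similarity of two vectors A and B is their normalized dot product, which is exactly what the Delphi function below computes:

```latex
\cos(\theta)
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{i=1}^{n} A_i B_i}
         {\sqrt{\sum_{i=1}^{n} A_i^{2}} \; \sqrt{\sum_{i=1}^{n} B_i^{2}}}
```

The result ranges from -1 to 1; the closer it is to 1, the more similar the two texts are.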
// Mathematical operation to calculate the similarity between two vectors
function CosineSimilarity(const A, B: TArray<Double>): Double;
var
DotProduct, MagnitudeA, MagnitudeB: Double;
I: Integer;
begin
DotProduct := 0;
MagnitudeA := 0;
MagnitudeB := 0;
for I := Low(A) to High(A) do
begin
DotProduct := DotProduct + A[I] * B[I];
MagnitudeA := MagnitudeA + A[I] * A[I];
MagnitudeB := MagnitudeB + B[I] * B[I];
end;
if (MagnitudeA = 0) or (MagnitudeB = 0) then
Result := 0
else
Result := DotProduct / (Sqrt(MagnitudeA) * Sqrt(MagnitudeB));
end;
In Delphi, we query the Firebird database to retrieve all the stored embeddings. The cosine similarity is then calculated programmatically, iterating through the embeddings and finding the one with the highest similarity score.
First we retrieve the embedding of the user prompt from the OpenAI API; then we find the most similar embedding in the table:
// Function to retrieve all the embeddings from the database
function GetAllEmbeddingsFromDatabase(Connection: TFDConnection): TArray<TArray<Double>>;
var
Query: TFDQuery;
Embeddings: TArray<TArray<Double>>;
EmbeddingArray: TArray<Double>;
I, Count: Integer;
F: TArrayField;
begin
Query := TFDQuery.Create(nil);
try
Query.Connection := Connection;
// ORDER BY guarantees the same row order as in FindMostSimilarEmbedding
Query.SQL.Text := 'SELECT ID_EMBED, VECTOREM FROM EMBEDDINGS ORDER BY ID_EMBED';
Query.Open;
while not Query.Eof do
begin
SetLength(EmbeddingArray, Vector_Size);
F:= TArrayField(Query.FieldByName('VECTOREM'));
for I := 0 to Vector_Size - 1 do
begin
EmbeddingArray[I]:= F.FieldValues[I];
end;
Embeddings := Embeddings + [EmbeddingArray];
Query.Next;
end;
Result := Embeddings;
finally
Query.Free;
end;
end;
function FindMostSimilarEmbedding(Connection: TFDConnection; const QueryEmbedding: TArray<Double>): Integer;
var
AllEmbeddings: TArray<TArray<Double>>;
MaxSimilarity, Similarity: Double;
MostSimilarID, I: Integer;
Query: TFDQuery;
begin
// Get all embeddings from database
AllEmbeddings := GetAllEmbeddingsFromDatabase(Connection);
MaxSimilarity := -1;
MostSimilarID := -1;
Query := TFDQuery.Create(nil);
try
Query.Connection := Connection;
// Same ORDER BY as in GetAllEmbeddingsFromDatabase, so the indexes line up
Query.SQL.Text := 'SELECT ID_EMBED FROM EMBEDDINGS ORDER BY ID_EMBED';
Query.Open;
I := 0;
// Calculate the cosine similarity with each stored embedding
while not Query.Eof do
begin
Similarity := CosineSimilarity(QueryEmbedding, AllEmbeddings[I]);
if Similarity > MaxSimilarity then
begin
MaxSimilarity := Similarity;
MostSimilarID := Query.FieldByName('ID_EMBED').AsInteger;
end;
Query.Next;
Inc(I);
end;
Result := MostSimilarID;
finally
Query.Free;
end;
end;
6. Send the request to the OpenAI API with prompt + context
Once the most relevant context has been identified, it is used to enhance the prompt that will be sent to the OpenAI API. The goal is to provide the model with additional background information to improve the accuracy and relevance of the generated response. The prompt is built by combining the retrieved context along with the user's query. This comprehensive prompt is then sent to OpenAI's API, specifically using the gpt-3.5-turbo model (or another suitable model), which will generate a response based on both the context and the query. This approach ensures that the generated answer is as precise and useful as possible.
In Delphi, we use the TNetHTTPClient component to make the HTTP request to OpenAI's API. Here is a sample Delphi function to send the request:
function ExecuteQueryWithContext(Connection: TFDConnection; const UserQuery: string): string;
var
QueryEmbedding: TArray<Double>;
MostRelevantID: Integer;
MostRelevantContext, FinalPrompt: string;
Query: TFDQuery;
begin
QueryEmbedding := GetEmbedding(UserQuery);
// Find the id of the nearest context text
MostRelevantID := FindMostSimilarEmbedding(Connection, QueryEmbedding);
// Retrieve the stored text for that id
Query := TFDQuery.Create(nil);
try
Query.Connection := Connection;
Query.SQL.Text := 'SELECT TEXT FROM EMBEDDINGS WHERE ID_EMBED = :id';
Query.ParamByName('id').AsInteger := MostRelevantID;
Query.Open;
MostRelevantContext := Query.FieldByName('TEXT').AsString;
finally
Query.Free;
end;
// Build the final prompt
FinalPrompt := 'Relevant context: ' + MostRelevantContext + sLineBreak +
'User prompt: ' + UserQuery + sLineBreak +
'Answer:';
// Get the answer from OpenAI
Result := GetOpenAIResponse(FinalPrompt);
end;
function GetOpenAIResponse(const Prompt: string): string;
var
HttpClient: TNetHTTPClient;
RequestJSON: TJSONObject;
RequestBody: TStringStream;
Response: IHTTPResponse;
JSONObj: TJSONObject;
ChoicesArray: TJSONArray;
CompletionText: string;
begin
HttpClient := TNetHTTPClient.Create(nil);
try
HttpClient.ContentType := 'application/json';
HttpClient.CustomHeaders['Authorization'] := 'Bearer ' + ApiKey;
// Build the request body; the temporary JSON object is freed to avoid a memory leak
RequestJSON := TJSONObject.Create
.AddPair('model', 'gpt-3.5-turbo')
.AddPair('messages', TJSONArray.Create(
TJSONObject.Create
.AddPair('role', 'user')
.AddPair('content', Prompt)
))
.AddPair('max_tokens', 150)
.AddPair('temperature', 0.7);
try
RequestBody := TStringStream.Create(RequestJSON.ToString, TEncoding.UTF8);
finally
RequestJSON.Free;
end;
try
Response := HttpClient.Post('https://api.openai.com/v1/chat/completions', RequestBody);
if Response.StatusCode = 200 then
begin
// Parse the JSON response
JSONObj := TJSONObject.ParseJSONValue(Response.ContentAsString) as TJSONObject;
try
ChoicesArray := JSONObj.GetValue<TJSONArray>('choices');
if ChoicesArray.Count > 0 then
CompletionText := ChoicesArray.Items[0].GetValue<TJSONObject>('message').GetValue<string>('content')
else
CompletionText := '';
Result := CompletionText;
finally
JSONObj.Free;
end;
end
else
begin
raise Exception.CreateFmt('Error getting the answer. Status code: %d - %s', [Response.StatusCode, Response.StatusText]);
end;
finally
RequestBody.Free;
end;
finally
HttpClient.Free;
end;
end;
Conclusion
The described process effectively demonstrates how to create a Retrieval-Augmented Generation (RAG) system by leveraging stored embeddings saved in a Firebird database. By retrieving the most relevant context and combining it with the user query, we can generate high-quality responses that are informed by previous knowledge.
This approach combines the power of retrieval-based methods with generative models, resulting in a system that not only provides relevant information but also creates contextually appropriate answers to user queries.
This article has been written as a first contact, to understand how these systems work and how the information is exchanged. There is a lot of room for improvement, such as indexing the embeddings table with search words, segmenting the table into groups or clusters, using other embedding query methods, etc.
My first contact with AI from Delphi code was through the Maker AI components, which I recommend visiting and using.