Spring AI and custom JSON data
Most of the Spring AI examples use "Dad Jokes" to demonstrate the capabilities of integrating with LLMs. I have nothing against dad jokes; I actually enjoy them. However, I have a different use case for using Spring AI.
I have the following requirements:
Given these requirements, Ollama looked like the obvious choice. Throw in a PGVector DB for RAG and we are off to the races. Setting up Ollama locally was a breeze; the Llama 3.2 model came next, and the jokes were forthcoming!
One thing that becomes clear after the first few attempts is that LLMs have a strong affinity for text—lots of it! Feed them paragraphs from PDFs or other documents, and they can efficiently process and summarize the content. They also handle JSON payloads with a few nodes and lengthy descriptions quite effectively.
However, my payloads are OCR data extracted when a customer registers with us. Here is an example of what it can look like:
{
"msisdn": "27782014637",
"first-name": "Aandre",
"last-name": "Kapp",
"gender": "MALE",
"email-address":"[email protected]",
"date-of-birth": "1999/04/12",
"identification-number": "9904120000008",
"identification-id-type":"SA_ID",
"identification-id-country":"ZA",
"street-address": " 1 Spine Road ",
"street-suburb": "Trist?o",
"street-city": "Bloemfontein",
"street-province": "WESTERN CAPE",
"postal-code": "9230"
}
For my first attempt at adding this information to the RAG database, I used the JsonReader provided by Spring AI.
public void addCustomerData(Resource jsonResource) {
    // The trailing String arguments are jsonKeysToUse: the JSON keys the reader is told to use.
    JsonReader jsonReader = new JsonReader(jsonResource, new CustomerMetadataGenerator(),
            "customer-id", "msisdn", "first-name", "last-name");
    var documentList = jsonReader.get();
    vectorStore.add(documentList);
}
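import java.util.HashMap;
import java.util.Map;
import org.springframework.ai.reader.JsonMetadataGenerator;

public class CustomerMetadataGenerator implements JsonMetadataGenerator {
    @Override
    public Map<String, Object> generate(Map<String, Object> jsonMap) {
        // Map.of throws on null values, so copy only the fields that are present.
        Map<String, Object> metadata = new HashMap<>();
        for (String key : new String[] {"msisdn", "first-name", "last-name"}) {
            Object value = jsonMap.get(key);
            if (value != null) {
                metadata.put(key, value);
            }
        }
        return metadata;
    }
}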
On the first attempt to query the solution: boom! It flat-out refused to answer the questions. The main issue lies in the restrictions imposed on these LLMs, which appear to block questions about personal data.
In comes Dolphin 2.9, and we have answers!
This is the initial question and response after processing the data mentioned above:
"Where does andre kapp lives? If you can't find his address, then return the date-of-birth infomation if you can find it"
Based on the context provided, I am unable to find any information about Andre Kapp's place of residence or date-of-birth. Therefore, I regret that I'm unable to assist with this request as there is not enough data available for me to provide you with an accurate answer. Please provide more relevant information if it becomes available so that I can better assist you in the future.
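Modifying the JsonReader to omit the jsonKeysToUse option led to a significantly improved response. As far as I can tell, when keys are listed only those fields end up in the document text, so the address and date of birth were never stored for retrieval.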
JsonReader jsonReader = new JsonReader(jsonResource, new CustomerMetadataGenerator());
This looks better now. Still, plenty to test and play with.
Based on the provided context, I found that Andre Kapp lives at 1 Spine Road, Trist?o, Bloemfontein, Western Cape, South Africa, with a postal code of 9230.
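To make the difference between the two JsonReader calls easier to see, here is a minimal plain-Java sketch of the key selection as I understand it. It is not Spring AI's code: the class KeySelectionSketch and the helper buildContent are hypothetical names made up for this illustration, and the real reader may format its text differently.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeySelectionSketch {

    // Hypothetical helper, not Spring AI code: builds the text that would be embedded.
    // An empty key list is treated as "use every field".
    public static String buildContent(Map<String, Object> record, List<String> keysToUse) {
        StringBuilder content = new StringBuilder();
        for (Map.Entry<String, Object> entry : record.entrySet()) {
            if (keysToUse.isEmpty() || keysToUse.contains(entry.getKey())) {
                content.append(entry.getKey()).append(": ").append(entry.getValue()).append('\n');
            }
        }
        return content.toString();
    }

    public static void main(String[] args) {
        // A cut-down version of the customer record shown earlier.
        Map<String, Object> customer = new LinkedHashMap<>();
        customer.put("msisdn", "27782014637");
        customer.put("first-name", "Aandre");
        customer.put("last-name", "Kapp");
        customer.put("date-of-birth", "1999/04/12");
        customer.put("street-city", "Bloemfontein");

        // Same keys as the first attempt: no address or date of birth survives.
        System.out.println(buildContent(customer,
                List.of("customer-id", "msisdn", "first-name", "last-name")));

        // No keys given: every field ends up in the text.
        System.out.println(buildContent(customer, List.of()));
    }
}
```

With the key list, the text contains only the msisdn and the two name fields, which would explain why the model could not find an address or a date of birth in the retrieved context.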
Anybody with advice on how to improve this - please shout!