AI and Java with Spring Boot
When people start coding AI applications, they often assume that Python is the only viable language for the job. It’s true that Python offers many advantages as the leading language for AI development, with an extensive ecosystem of open-source libraries such as LangChain and LangGraph, and most of the examples and tutorials available online are in Python. In fact, when I began exploring AI and large language models (LLMs), Python was my first choice as well.
However, having coded in Java for over 25 years, I have a deep appreciation for the language. Moreover, there are practical considerations for organizations with legacy systems that cannot easily adopt a polyglot approach. For these systems, Java may remain a necessity, especially when incorporating AI into existing infrastructure.
When Spring Boot introduced its AI support, I seized the opportunity to create a proof of concept (POC) to explore its capabilities.
Overview of My POC
My POC tested three types of solutions: simple chat completions against an LLM, retrieval-augmented generation (RAG), and function calling.
Key Technologies
To limit external dependencies, I used only Spring Boot AI libraries, although other Java-based options like LangChain4J are available. For local LLM execution, I used Ollama to run the models on my computer, but I also tested with GroqCloud (Spring AI doesn't have a dedicated library for Groq, but Groq supports the OpenAI protocol, so you can point the OpenAI client at the API endpoint https://api.groq.com/openai).
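Since there is no dedicated Groq starter, the configuration simply redirects the OpenAI starter to Groq's OpenAI-compatible endpoint. A minimal sketch, assuming the standard Spring AI OpenAI starter properties (the model name is only an example of what GroqCloud offers):

```properties
# Route the OpenAI starter to GroqCloud's OpenAI-compatible endpoint
spring.ai.openai.base-url=https://api.groq.com/openai
spring.ai.openai.api-key=${GROQ_API_KEY}
# Example model name -- use any model available on GroqCloud
spring.ai.openai.chat.options.model=llama3-70b-8192
```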
Setting Up the Project
First, you need to add the following repositories to your Maven pom.xml:
<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
    <repository>
        <id>spring-snapshots</id>
        <name>Spring Snapshots</name>
        <url>https://repo.spring.io/snapshot</url>
        <releases>
            <enabled>false</enabled>
        </releases>
    </repository>
</repositories>
Next, add the necessary dependencies to your project. Spring Boot provides various starters, including ones for Ollama and OpenAI:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-core</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-retry</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-tika-document-reader</artifactId>
</dependency>
The Tika document-reader dependency, provided by Spring AI, handles parsing uploaded documents and splitting them into chunks.
Implementation Details
For each LLM provider, I created a dedicated service. The user selects the desired LLM, and the REST controller routes the request to the appropriate service.
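Stripped of the Spring annotations, the routing itself is just a dispatch from provider name to service. A minimal sketch (the provider names and stand-in lambdas are illustrative; a real controller would delegate to the injected services):

```java
import java.util.Map;
import java.util.function.UnaryOperator;

public class LlmRouter {
    private final Map<String, UnaryOperator<String>> services;

    public LlmRouter() {
        // Illustrative stand-ins; in the real app these delegate to
        // TestOllamaService, the OpenAI service, etc.
        UnaryOperator<String> ollamaService = message -> "ollama:" + message;
        UnaryOperator<String> openAiService = message -> "openai:" + message;
        this.services = Map.of("ollama", ollamaService, "openai", openAiService);
    }

    public String route(String provider, String message) {
        UnaryOperator<String> service = services.get(provider);
        if (service == null) {
            throw new IllegalArgumentException("Unknown LLM provider: " + provider);
        }
        return service.apply(message);
    }

    public static void main(String[] args) {
        System.out.println(new LlmRouter().route("ollama", "hello"));
    }
}
```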
Below is an example of an Ollama service:
@Service
public class TestOllamaService {

    @Autowired
    private OllamaChatModel chatModel;
    ...

    public WebResponse sendMessage(String message, String model) {
        ChatResponse response = chatModel.call(
                new Prompt(message,
                        OllamaOptions.create()
                                .withModel(model)
                                .withTemperature(0.4F)
                )
        );
        return new WebResponse(response.getResult().getOutput().getContent());
    }
}
Retrieval-Augmented Generation (RAG) Implementation
For RAG, I created a service that handles the vector database and document operations once a document has been uploaded and stored locally.
I injected the embedding model (Ollama embeddings) and instantiated an in-memory vector store (SimpleVectorStore ships with the Spring AI libraries, but you can use any other vector DB):
@Autowired
public VectorDBService(@Qualifier("ollamaEmbeddingModel") EmbeddingModel embeddingModel) {
    this.embeddingModel = embeddingModel;
    this.vectorStore = new SimpleVectorStore(this.embeddingModel);
}
I also implemented functions to split documents into chunks, add them to the vector database, and perform similarity searches on it:
public List<Document> splitDocument(Path destinationFile) {
    TikaDocumentReader documentReader =
            new TikaDocumentReader(destinationFile.toUri().toString());
    List<Document> documents = documentReader.get();
    return new TokenTextSplitter().apply(documents);
}

public void addDocumentsToVectorDB(List<Document> splitDocuments) {
    vectorStore.add(splitDocuments);
}

public List<Document> similaritySearch(SearchRequest request) {
    return vectorStore.similaritySearch(request);
}
The documents related to the user's query are retrieved from the vector database and concatenated into a single context, which is then passed to the LLM as part of the prompt:
private static final String PROMPT_TEMPLATE = """
        Answer the question based only on the following context:
        {context}
        ---
        Answer the question based on the above context: {question}
        """;
public Flux<ChatResponse> getStreamRAGResponse(String message, String model) {
    List<Document> docs =
            vectorDBService.similaritySearch(SearchRequest.query(message));

    // Concatenate the content of all retrieved documents to use as context
    StringBuilder sb = new StringBuilder();
    for (Document doc : docs) {
        sb.append(doc.getContent()).append('\n');
    }

    // PromptTemplate replaces the {variables} in the prompt
    PromptTemplate promptTemplate = new PromptTemplate(PROMPT_TEMPLATE);
    Message fullMessage = promptTemplate.createMessage(
            Map.of("context", sb.toString(), "question", message));

    return chatModel.stream(
            new Prompt(
                    fullMessage,
                    OllamaOptions.create()
                            .withModel(model)
                            .withTemperature(0.4F)
            ));
}
Function Calling Implementation
Function calling in Spring AI required a slightly more complex setup than in Python. Functions are defined as Spring Beans. Here is an example:
public class DBService implements Function<DBService.Request, DBService.Response> {

    public record Request(String table, String column) {}
    public record Response(Map<String, String> employees) {}

    @Override
    public Response apply(Request request) {
        // mock DB query response
        return new Response(Map.of("David", "Architect", "John", "Developer"));
    }
}
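Because the function is a plain java.util.Function, it can be exercised without any Spring context. A self-contained sketch (the nested class mirrors the service above, with the same mock data; the `queryEmployees` helper is only for illustration):

```java
import java.util.Map;
import java.util.function.Function;

public class DBServiceDemo {

    // Standalone copy of the function bean, with the same mock data
    public static class DBService implements Function<DBService.Request, DBService.Response> {
        public record Request(String table, String column) {}
        public record Response(Map<String, String> employees) {}

        @Override
        public Response apply(Request request) {
            // mock DB query response
            return new Response(Map.of("David", "Architect", "John", "Developer"));
        }
    }

    // Illustrative helper: invoke the function directly, as the LLM runtime would
    public static Map<String, String> queryEmployees() {
        return new DBService().apply(new DBService.Request("employees", "role")).employees();
    }

    public static void main(String[] args) {
        System.out.println(queryEmployees());
    }
}
```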
This class is then registered as a Bean:
@Configuration
public class ToolsConfig {

    @Bean
    public Function<DBService.Request, DBService.Response> getEmployees() {
        return new DBService();
    }
}
Finally, the function is added to the LLM call options. Note that you can register a list of functions, and the LLM chooses which one to invoke based on the end user's prompt:
public ChatOptions buildChatOptionsWithTool(String model) {
    return OpenAiChatOptions.builder()
            .withModel(model)
            .withTemperature(0.4F)
            .withFunctionCallbacks(List.of(
                    FunctionCallbackWrapper.builder(new DBService())
                            .withName("getEmployees")
                            .withDescription("Get employees from DB")
                            .build()))
            .build();
}
Conclusion
Spring AI makes developing AI applications in Java straightforward. The documentation is comprehensive, and the examples provided by Spring AI are helpful. While Python may be more dominant in the AI space, Java remains a powerful and viable option for AI development, especially in enterprise environments with existing Java infrastructures.