Implementing AI and LLMs to Revolutionize Mortgage Technology

Author: Chris Wang | Data Scientist at Zeitro

Understanding AI and LLMs

Artificial Intelligence has made significant strides, with AI models now capable of understanding and generating human-like language. Large Language Models (LLMs), such as OpenAI's GPT-4o and the more recent o1-preview, are trained on vast amounts of text data, enabling them to answer questions, generate content, and hold meaningful conversations.

The Challenge: AI Hallucinations

Despite their impressive capabilities, LLMs have a known limitation—they can sometimes produce responses that sound plausible but are factually incorrect or nonsensical. This phenomenon is known as "hallucination." It's like asking an expert for advice, and they confidently provide an answer that's completely off-base.

Discover the Power of RAG: Your AI Assistant for Complex Documents

Have you ever found yourself lost in a sea of information, trying to find that one crucial piece of data buried within a lengthy document? Imagine having a smart assistant that not only understands your questions but also dives into vast amounts of text to fetch the most relevant information for you. Welcome to the world of Retrieval-Augmented Generation (RAG)!


How RAG Enhances AI Accuracy

Bridging the Gap with RAG and LLMs (like ChatGPT)

Retrieval-Augmented Generation (RAG) addresses the hallucination problem by grounding the AI's responses in real source documents. It works by retrieving relevant passages from a knowledge base and passing them to the model as context, so the answers are far more likely to be both accurate and contextually relevant. Think of RAG as equipping our AI with a reliable reference library, reducing the chances of hallucinations and increasing the quality of its responses.

Revolutionize Mortgage Business with RAG: Your AI-Powered Information Navigator

Imagine navigating the complex world of mortgage or real estate with ease, where every piece of information you need is just a question away. Whether you're a mortgage professional, a real estate agent, or an investor, the power of Retrieval-Augmented Generation (RAG) is here to transform how you access and utilize information.


Transforming Mortgage Technology with RAG

At Zeitro, we're leveraging RAG to revolutionize how professionals interact with complex industry documents. One of the most significant challenges in the mortgage sector is dealing with intricate PDFs like the Fannie Mae Product Guidelines. These documents are extensive, dense with information, and constantly updated, making it difficult for loan officers to find the specific details they need promptly.

Empowering Loan Officers with Instant Access

Using RAG, Zeitro transforms these complex PDFs into an accessible knowledge base. Loan officers can now ask specific questions related to Fannie Mae guidelines and receive precise, contextually relevant answers in real-time. This means no more flipping through hundreds of pages or searching for the latest updates—the information is at their fingertips.

Streamlining Client Consultations

With instant access to accurate information, loan officers can provide clients with immediate answers to their queries. Whether it's about eligibility criteria, interest rates, or documentation requirements, RAG enables them to deliver efficient and confident service, enhancing client trust and satisfaction.


A Step-by-Step Journey Through RAG by Zeitro

1. Document Preparation

PDF Reading: Our journey begins by transforming complex PDF documents into readable text. Each page is meticulously converted, ensuring no detail is left behind. This process is akin to translating a foreign language into one we can all understand.

Zeitro's Application: We process extensive documents like the Fannie Mae Product Guidelines, turning them into machine-readable text that our AI can work with effectively.

Text Splitting: To make the information more digestible, the text is divided into smaller, bite-sized chunks. Imagine organizing a massive library by breaking it down into sections and chapters, making it easier to find exactly what you're looking for.

Zeitro's Application: We segment the guidelines into logical sections, such as loan types, credit requirements, and property eligibility, so that specific information can be retrieved quickly.
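To make the text-splitting step concrete, here is a minimal sketch in Python. It is not Zeitro's production splitter: it assumes a simple fixed-size character window with overlap, and the names split_text, chunk_size, and overlap are hypothetical. A real pipeline would more likely split on section headings or sentences.

```python
from typing import List


def split_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into fixed-size character windows that overlap.

    Assumption: character windows are a stand-in for the logical
    sections (loan types, credit requirements, ...) described above.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval later on.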

2. Metadata Creation

Chunk Metadata: Each text chunk is tagged with metadata—like page numbers and content summaries. It's similar to adding bookmarks or sticky notes in a book, allowing us to pinpoint precisely where specific information is located.

Zeitro's Application: Our system tags each chunk with relevant metadata, including page numbers, quote locations, and position coordinates, so every answer can be traced back to its exact place in the current version of the document.
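As a rough illustration of chunk metadata, the sketch below tags each chunk with its page number and character offsets. The Chunk class, its field names, and the tag_chunks helper are all hypothetical; the real system also stores position coordinates, which are left out here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Chunk:
    chunk_id: str  # hypothetical ID scheme: page number plus start offset
    text: str
    page: int      # 1-based page number in the source PDF
    start: int     # character offset where the chunk begins on its page
    end: int       # character offset just past the end of the chunk


def tag_chunks(pages: List[str], chunk_size: int = 300) -> List[Chunk]:
    """Cut each page into chunks and record where each chunk came from."""
    chunks = []
    for page_no, page_text in enumerate(pages, start=1):
        for start in range(0, len(page_text), chunk_size):
            piece = page_text[start:start + chunk_size]
            chunks.append(
                Chunk(f"p{page_no}-c{start}", piece, page_no, start, start + len(piece))
            )
    return chunks
```

With this kind of record attached to every chunk, an answer can cite the page it came from, and a viewer can highlight the exact span.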

3. Embedding and Storage

Embedding Function: We convert each text chunk into a unique numerical fingerprint using advanced embedding techniques. This allows our AI to understand the essence of the text, much like recognizing the unique features of a familiar face.

Chroma Database: These embeddings are stored in the Chroma database, a powerhouse designed to handle and organize vast amounts of information efficiently. It's our AI's version of a well-organized digital library.
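The idea of embedding and storing chunks can be sketched without any external services. The code below is only a toy: embed is a hashed bag-of-words vector standing in for a real embedding model, and InMemoryVectorStore is a hypothetical stand-in for a vector database such as Chroma. Neither reflects the actual Chroma API.

```python
import hashlib
import math
import re
from typing import Dict, List, Tuple


def embed(text: str, dim: int = 256) -> List[float]:
    """Toy embedding: hash each word into one of `dim` buckets, then normalize.

    Assumption: this replaces a learned embedding model purely for illustration.
    """
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, str, Dict, List[float]]] = []

    def add(self, chunk_id: str, text: str, metadata: Dict) -> None:
        self._rows.append((chunk_id, text, metadata, embed(text)))

    def query(self, question: str, n_results: int = 3) -> List[Tuple[float, str, str, Dict]]:
        """Return (score, id, text, metadata) rows, best cosine match first."""
        q = embed(question)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), cid, text, meta)
            for cid, text, meta, vec in self._rows
        ]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:n_results]
```

Because the vectors are normalized, the dot product in query is the cosine similarity, the same kind of score a real vector store would rank by.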

4. Query Processing

User Query: When you ask a question, our system springs into action. It searches the Chroma database for the most relevant text chunks, much like having a personal librarian who knows exactly where to find the answers you seek.

Contextual Retrieval: The retrieved text provides the context our AI needs to craft a precise and relevant response. It's the difference between getting a generic answer and one that's tailored just for you.

AI Model Interaction: With the context in hand, our AI model generates a response that is not only accurate but also insightful. It's like having a conversation with an expert who has read every page of the document.

Zeitro's Application: For example, when a loan officer asks about down payment requirements, the AI not only provides the down payment percentages but also includes any exceptions or additional conditions that might apply, ensuring the loan officer has a comprehensive understanding.
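The query-processing flow above (retrieve, then hand the context to the model) can be sketched as follows. This is a simplified illustration under stated assumptions: retrieve uses plain keyword overlap instead of vector search, the chunk dictionaries and the prompt wording are hypothetical, and the passages are placeholders rather than real guideline text.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: List[Dict], k: int = 2) -> List[Dict]:
    """Rank chunks by Jaccard word overlap (a stand-in for vector search)."""
    q = tokenize(question)

    def score(chunk: Dict) -> float:
        words = tokenize(chunk["text"])
        union = q | words
        return len(q & words) / len(union) if union else 0.0

    return sorted(chunks, key=score, reverse=True)[:k]


def build_prompt(question: str, context_chunks: List[Dict]) -> str:
    """Assemble a grounded prompt that cites the page of each passage."""
    context = "\n\n".join(f"[page {c['page']}] {c['text']}" for c in context_chunks)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Placeholder passages, not real guideline content.
chunks = [
    {"page": 12, "text": "Placeholder passage about down payment requirements and exceptions."},
    {"page": 40, "text": "Placeholder passage about appraisal timelines."},
    {"page": 77, "text": "Placeholder passage about credit score documentation."},
]
question = "What are the down payment requirements?"
prompt = build_prompt(question, retrieve(question, chunks, k=1))
```

The resulting prompt string is what would be sent to the language model. Telling the model to answer only from the supplied context is what keeps the response grounded in the document.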


Our Journey with GuidelineGPT: Transforming Mortgage Technology with AI

Over the past five months, I've been on an incredible journey working on GuidelineGPT, a project that's all about revolutionizing the mortgage industry using Artificial Intelligence and Large Language Models (LLMs). This experience hasn't just deepened my understanding of AI technologies like Retrieval-Augmented Generation (RAG); it's also given me a front-row seat to how these innovations can streamline complex processes in the mortgage sector.

June 2024: Diving into Legacy Code and Cutting Response Times

In June, I found myself knee-deep in the legacy code that interfaced with the OpenAI API. The initial response times were a staggering three minutes—definitely not user-friendly. Through meticulous code optimization and better management of vector stores, I managed to bring the response time down to less than 30 seconds. This not only made things faster for users but also significantly cut down on operational costs.

Thoughts: Wrangling with how the LLM provider structures data was both challenging and enlightening. Dealing with poorly documented legacy code was tough—I remember feeling pretty frustrated at times. But it really drove home the importance of maintaining clean and well-documented codebases. It's something that makes future scalability and maintenance so much easier.

July 2024: Implementing Streaming Mode and Enhancing User Interaction

Building on that progress, July was all about refining the user experience. I implemented OpenAI's streaming mode, which slashed the response time down to about 10 seconds. That real-time interaction was a game-changer for user engagement. I also focused on recording how users interacted with the AI on the backend and cleaned up the database for better data management.

Thoughts: Seeing the streaming mode in action was fantastic—it made interactions feel so much more natural and instantaneous. Cleaning up the database wasn't the most glamorous task, but it provided valuable insights into user engagement and how features were being utilized. That information really helped guide future enhancements.
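To show why streaming feels faster, here is a small self-contained sketch. It makes no real API call: fake_model_stream is a hypothetical stand-in for a streaming LLM response, and render_stream plays the role of a UI that displays each piece as it arrives.

```python
from typing import Callable, Iterable, Iterator


def fake_model_stream(answer: str) -> Iterator[str]:
    """Hypothetical stand-in for a streaming response: yields one word at a time."""
    for word in answer.split(" "):
        yield word + " "


def render_stream(tokens: Iterable[str], on_token: Callable[[str], None]) -> str:
    """Pass each token to the display callback as it arrives, then return the full text."""
    parts = []
    for token in tokens:
        on_token(token)  # the user sees this piece immediately
        parts.append(token)
    return "".join(parts).strip()


shown = []
full_answer = render_stream(
    fake_model_stream("Streaming shows partial answers early"), shown.append
)
```

The total generation time is unchanged, but the first words appear as soon as they are produced, so the user is not waiting on the complete answer.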

August 2024: Expanding Guidelines and Exploring RAG with Claude API

In August, I expanded the AI's knowledge base by incorporating more guidelines, including FHA, VA, and Freddie Mac. I switched over to the Claude API because it offered better control, faster response times, a more stable connection, and was easier to fine-tune. The typewriter-style output can now start in less than 2 seconds, which is an impressive improvement. This was also when I dove deep into the RAG method, integrating it with Claude to enhance answer accuracy.

Thoughts: Having a stable connection made the AI feel so much more trustworthy. Integrating RAG with embedding models was a fascinating journey. It was incredible to see how transformer architectures could enhance contextual understanding, making the AI's responses more precise and reliable. It felt like we were unlocking new potential.

September 2024: Enhancing Accuracy with Sentence Embeddings and OCR Processing

September was all about precision. I upgraded the RAG framework to use the sentence embedding model all-MiniLM-L6-v2, which significantly improved accuracy. I also enabled page-location highlighting for the retrieved contexts and processed lengthy documents with AWS OCR models for precise location referencing.

Thoughts: Working with embedding models powered by transformers really paid off. I discovered that incorporating few-shot learning instructions actually outperformed zero-shot approaches, which was pretty exciting. Highlighting relevant contexts not only showed users how the AI was "thinking" but also helped them understand the responses better. Plus, it made debugging on our end a lot easier.

October 2024: Fine-Tuning with Agentic Actions and Knowledge Distillation

In October, I introduced agentic actions for information retrieval within the RAG database. These included letting the AI choose which retrieved passages to use as context and tuning it to generate questions more relevant to the mortgage context. I also implemented knowledge distillation with relevance thresholds to filter out irrelevant content, so that the AI provided only the most pertinent information.
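The relevance-threshold filtering mentioned above could look roughly like the sketch below. The function name, the default threshold of 0.5, and the cap of three chunks are assumptions for illustration, not the values used in GuidelineGPT.

```python
from typing import List, Tuple


def filter_by_relevance(
    scored_chunks: List[Tuple[float, str]],
    threshold: float = 0.5,
    max_chunks: int = 3,
) -> List[Tuple[float, str]]:
    """Keep only chunks whose similarity score meets the threshold.

    scored_chunks holds (score, text) pairs, e.g. cosine similarities
    returned by a vector search. The best-scoring chunks come first.
    """
    kept = [(score, text) for score, text in scored_chunks if score >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:max_chunks]
```

If nothing clears the threshold, the function returns an empty list, which gives the application a clear signal to say that the guidelines do not cover the question instead of answering from weak context.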

Thoughts: Satisfying the users' needs first became more important than ever. By fine-tuning the model's responses and enriching its knowledge base, I felt like we were really enhancing the AI's ability to serve customers effectively. It was fulfilling to see the AI becoming more aligned with what users actually wanted and needed.

November 2024: Implementing Version Control and Exploring New Frontiers (Ongoing)

As we moved into November, I added guideline version control features so users could see how guidelines have evolved over time. I adjusted the dynamic window size for context retrieval to match the language systems of individual files. One of the most exciting developments was starting to implement live voice chat about the guidelines.

Thoughts: Continuously adding features to refine the product has been both challenging and rewarding. Putting myself in the customer's shoes helped me design solutions that truly meet their needs. I'm increasingly excited about the potential applications of this technology, like chatbots for borrowers. I really believe it can make a big difference in improving the mortgage industry.

Looking Ahead

As I continue to develop GuidelineGPT, I'm excited about the future possibilities. The idea of integrating live voice chat and expanding the AI's knowledge base opens up new avenues for making the mortgage process more accessible and efficient. I'm also considering how these innovations can be applied to other areas, such as creating chatbots for borrowers to simplify their mortgage journey.

Final Thoughts

Working on GuidelineGPT has been more than just a project; it's been a transformative experience. It reinforced my belief in the power of technology to simplify complexities and improve people's lives. I look forward to continuing this journey, learning more, and contributing to the evolution of the mortgage industry through AI.


Why RAG is Your Go-To Solution

Enhanced Accuracy

Accurate information is crucial in the mortgage industry to ensure compliance with federal regulations and investor requirements. By retrieving the most relevant information from trusted sources, our AI grounds its answers in the guidelines themselves. This reduces the risk of hallucinations and of errors that can lead to compliance issues, lowering risk for both the company and its clients.

Contextual Understanding

Mortgage guidelines are notoriously complex and ever-changing. By implementing RAG, Zeitro simplifies these documents, making them accessible and understandable. Loan officers can stay informed about the latest updates without the hassle of manual research. With a deep understanding of the context, the AI provides responses that are not just correct but also meaningful and relevant to your specific query.

Scalability

Whether you're dealing with a single document or an entire library, our system scales effortlessly to meet your needs. It's designed to handle large volumes of data without compromising on speed or accuracy.

Improved User Experience

RAG transforms information retrieval into an engaging experience. No more wading through pages of text; instead, you get direct, concise answers, making your interaction with information both efficient and enjoyable.

Boosting Productivity and Efficiency

With quick access to information, loan officers spend less time searching for answers and more time assisting clients. This efficiency leads to increased productivity and the ability to handle more cases effectively.


Conclusion

My journey with GuidelineGPT has been an exhilarating blend of technical challenges and meaningful impact. By harnessing the power of AI and RAG, we've created a tool that not only simplifies the complexities of the mortgage industry but also sets the stage for future innovations. I'm eager to continue pushing the boundaries of what's possible, using technology to make the mortgage process more transparent, efficient, and accessible for all.


Tools to Use and References:

https://www.trychroma.com/
https://docs.anthropic.com/en/api/messages
https://platform.openai.com/
https://python.langchain.com/docs/tutorials/llm_chain/
https://acl2023-retrieval-lm.github.io/




