Wardley Map for Large Language Model (LLM) Architectures

Large Language Models: A Brief Overview

LLMs such as ChatGPT bring significant value to businesses. With their ability to interpret and generate human-like text, LLMs catalyze a fundamental shift in business operations. They unlock new levels of efficiency and intelligence across a broad spectrum of digital interfaces and data analysis platforms. They also improve information retrieval systems such as search engines and offer sophisticated text analysis.

All of these improvements ultimately drive efficiency and customer satisfaction. However, it is not easy to define the architecture, its components, and the interactions between them. In the current, rapidly emerging technology landscape, there is a rush to establish a solid LLM architecture and start development. Let's see how Wardley Maps can help.

Mapping the LLM Architecture

As a consultant using Wardley Maps, I view them as techniques for visually illustrating the hierarchy and evolution of services within a system. These maps help me understand the landscape and identify strategic opportunities.

[Figure: Wardley Map for an LLM architecture for extractive question answering]

For those unfamiliar with Wardley Maps, here is how they work. Wardley Maps help us organize and understand the evolution of a business or technology landscape (an LLM architecture, in this case). Components are arranged by maturity and importance: the horizontal axis represents evolution (from genesis to commodity), and the vertical axis represents the value chain (the higher a component sits, the more visible its value to the user).

Wardley Maps can help identify areas for improvement, spot opportunities, and make informed decisions about resource allocation, which is precisely what I did with the LLM example. The map gives a clear overview of the generative AI landscape and hints at what to do next.

Users and Information

In this specific scenario, the business challenge is to process user-generated Questions into precise, contextually accurate Answers drawn from a broad context (see the example on the map).

Question Context with NLP

This component processes questions with Natural Language Processing (NLP) techniques to prepare them for search in the Vector DB.
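As an illustration, here is a minimal sketch of what this preparation step could look like. The map does not prescribe a concrete pipeline, so the steps below (lowercasing, tokenization, stop-word removal) and the tiny stop-word list are assumptions; a real system would use a proper NLP library.

```python
import re

# Toy stop-word list for illustration only; a production pipeline would rely
# on an NLP library such as spaCy or NLTK.
STOP_WORDS = {"the", "a", "an", "is", "are", "what", "how", "of", "in", "do"}

def extract_question_terms(question: str) -> list[str]:
    """Lowercase, tokenize, and keep the content-bearing terms of a question."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return [t for t in tokens if t not in STOP_WORDS]

print(extract_question_terms("What is the refund policy in Germany?"))
# -> ['refund', 'policy', 'germany']
```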

Embeddings and APIs

Next, we have Embeddings: dense vector representations of text that capture semantic meaning. The ChatGPT Embedding API can transform data sources into these embeddings. We avoided running custom transformers, which would sit too low in the value chain.
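A minimal sketch of that call, assuming the OpenAI Python client and an API key in the environment; the model name is an illustrative choice, not a recommendation:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(texts: list[str], model: str = "text-embedding-ada-002") -> list[list[float]]:
    """Transform text snippets into dense embedding vectors via the API."""
    response = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]

vectors = embed(["How do I reset my password?"])
print(len(vectors[0]))  # embedding dimensionality, e.g. 1536 for ada-002
```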

Knowledge Base and External Sources

The Internal Knowledge Base has additional information about the entities in our domain. We can also leverage External/Open Data Sources to augment our context or validate answers.

Vector DB

A Vector DB stores precomputed embeddings. It is a service component: systems talk to it over an API to store vector representations and search for similar ones.
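To make the component's job concrete, here is a toy cosine-similarity search in NumPy. This illustrates the operation a Vector DB industrializes, not how one is implemented; a real deployment would consume a dedicated service over its API.

```python
import numpy as np

def cosine_search(query: np.ndarray, index: np.ndarray, top_k: int = 3) -> list[int]:
    """Return indices of the top_k stored vectors most similar to the query."""
    index_norm = index / np.linalg.norm(index, axis=1, keepdims=True)
    query_norm = query / np.linalg.norm(query)
    scores = index_norm @ query_norm  # cosine similarity per stored vector
    return list(np.argsort(scores)[::-1][:top_k])

# Toy index: five random "document" embeddings of dimension eight.
rng = np.random.default_rng(0)
print(cosine_search(rng.normal(size=8), rng.normal(size=(5, 8))))
```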

Information Flow and Inertia

[Figure: The flow of context-enriched and raw data]

The flow of context-aware information in this architecture is marked in yellow. It begins with the user's question and passes through context extraction and embedding transformation. The information then flows back: from the Vector DB to context extraction and on to prompt composition.

Here's a fascinating view. Unlike in traditional architectures, the flow of "raw" information, marked in green, never reaches users directly. The system uses it only for indexing and context extraction. Users get interpreted, summarized, and evaluated answers derived from the enriched raw data.

Links to the sources are still present for reference.
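Putting the yellow path end to end, here is a hedged sketch of the query flow. It reuses the embed() and cosine_search() helpers sketched above; the prompt template and the llm callable are assumptions standing in for whatever completion endpoint the system uses.

```python
import numpy as np

def answer_question(question, documents, doc_vectors, llm):
    """Follow the context-aware (yellow) path: embed, retrieve, compose, generate."""
    query_vector = np.array(embed([question])[0])       # question -> embedding
    hits = cosine_search(query_vector, np.array(doc_vectors))
    context = "\n".join(documents[i] for i in hits)     # Vector DB -> context
    prompt = (
        "Answer the question using only the context below, and cite sources.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt)  # llm: any callable wrapping a completion endpoint
```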

Internal Database and DB Engine

Finally, the Internal Database stores proprietary and domain-specific information, serving as a critical source of custom data. However, its design and maintenance introduce potential inertia. Together with the DB Engine, this component holds the raw data that is transformed into embeddings and indexed in the Vector DB.
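The green path mirrors the query flow on the indexing side: raw rows leave the Internal Database, pass through the embedding API, and land in the Vector DB. A minimal sketch, reusing embed() from above; the row shape is a made-up example of "raw" data.

```python
def index_rows(rows: list[dict]) -> tuple[list[str], list[list[float]]]:
    """Flatten raw database rows into text and embed them for the Vector DB."""
    documents = [f"{row['title']}: {row['body']}" for row in rows]
    return documents, embed(documents)

docs, vectors = index_rows([
    {"title": "Refunds", "body": "Refunds are processed within 14 days."},
    {"title": "Shipping", "body": "We ship to the EU and the UK."},
])
```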

Let's talk about where organizations will find inertia in this system. In Wardley maps, inertia is often depicted as a force that acts against the movement and evolution of components on the map. It can hinder innovation and prevent organizations from adapting to new market conditions or technological advancements.

Any guesses?

The highest inertia is typically found in components like the Internal Database, which may need custom design and maintenance to fit the application's specific needs. On the other hand, components like the ChatGPT Embedding API and the various data sources are industrialized components: widely available resources that need little to no customization or maintenance.

Fine-tuning

To keep the map simple, I have not added components for fine-tuning.

Let's talk about it, though. The concept of model fine-tuning does not fit well into this LLM architecture. Consider this: actual data, especially private and customer-specific data, must not be present in the training dataset. The model can memorize it and reveal it to any user who asks.

Industrialized, large foundational LLM services like OpenAI's allow fine-tuning as a service on domain data. OpenAI provides a dedicated fine-tuning API, a CLI to prepare and upload data, and the ability to deploy multiple versions of fine-tuned models.
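As an illustration of the data-preparation step, here is a minimal sketch that writes training examples in the JSONL format the fine-tuning tooling consumes. The chat-style message layout is an assumption tied to a particular API generation; check the current documentation before relying on it.

```python
import json

# Illustrative examples only; real training data comes from your domain and,
# per the caveat above, must be scrubbed of private, customer-specific details.
examples = [
    {"messages": [
        {"role": "user", "content": "What is our refund window?"},
        {"role": "assistant", "content": "Refunds are processed within 14 days."},
    ]},
]

with open("training_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```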

However, even if the process is easy, the whole point of industrialized components is benefiting from their standardized, pre-built functionality. When we engage in model fine-tuning, we risk losing these benefits as we customize the components and introduce complexity.

Consequently, consuming these industrial components as services is generally more efficient than spending resources to fine-tune them. This does not mean fine-tuning is never a good choice, but rather that its trade-offs should be carefully considered.

Conclusion

Mapping an LLM architecture with a Wardley Map illuminates the interaction of custom and industrialized components. It also helps identify areas of inertia that might slow down change and innovation. Understanding and planning for these elements is crucial for creating efficient, scalable systems as we continue to build more complex and powerful language models.

As we look to the future of Large Language Models (LLMs), the question isn't so much what these technologies can do (pretty much everything involving a sequence of bytes). The question is rather how they may redefine how businesses handle knowledge.

The promise of LLMs lies in their capacity to turn raw data into high-quality, actionable insights. Ok, why does this matter? It matters because insights, not raw data, drive decision-making. The evolution of LLMs could liberate organizations from the often resource-intensive task of managing raw data and allow them to focus on the insights these advanced models provide. The conventional wisdom says that organizations must maintain as much data as possible to train and leverage that data in LLMs. They do not have to.

Why might businesses reduce reliance on traditional internal databases?

As LLMs continue to improve, it will become apparent how much they outperform conventional systems in extracting meaningful knowledge from data. Consequently, companies will seek more streamlined data storage and maintenance solutions, freeing resources for strategic action rather than maintenance.

When companies evaluate whether server maintenance advances their business goals - and usually, the answer is no - they happily migrate to the cloud. The same will happen to "raw" data.

However, the success of this transition hinges on several factors: the effectiveness and reliability of LLMs (which are improving slowly despite all the "game-changer" announcements), and the inertia of the components. The legacy practice of maintaining massive data sources to produce simple representations (views) and, more recently, ML models that need an army of data scientists to answer simple questions is ending. Pretty soon, we will realize that the economy of questions has begun, and the economy of generating answers is almost over.

In conclusion, the 'why' of integrating LLMs into business architectures is compelling. The ability of LLMs to generate actionable insights is unprecedented. Companies should harness it to build efficient, future-ready systems that drive strategic decision-making.

Link to the source of the map: https://onlinewardleymaps.com/#KH4YDEtpwSFPTpdvvv

Thiyagarajan Maruthavanan (Rajan)

1 year ago

Can you check the link? When I open it, it shows a different Wardley map, titled "Automatic Vulnerability Prioritisation", which differs from the screenshot shared.
