Architecture-Safe GenAI
Myles Suer
Serving CIOs driving agile transformational businesses. Emeritus #CIOChat Facilitator. IDG Contributor. #1 CIO Influencer. Top 100 Digital Influencer. Research Director at Dresner Advisory Services.
In 1998, I raised venture capital and started a new company called eBalance. It was an amazing and heady time. Seemingly for the first time, tech was cool and even being talked about on mainstream media like CNBC. Now, we have ChatGPT…which is very cool. Adding to the tech excitement, Nvidia announced recent sales had jumped 170%, largely driven by demand for AI chips.
AI Adoption Continues Accelerating
However, few companies are fully prepared for the widespread use of GenAI or for the business risks these tools can bring.
Without question, organizations need the ability to manage, monitor, and control these models and their usage. An effective solution will deploy data security governance across the entire GenAI pipeline, from training data to user prompts and model responses.
The authors of the book “Rewired” suggest there need to be “clear standards and thresholds for AI risk including transparency and explainability, [and] automated AI model monitoring systems.”
Defining the Architecture
Given this, what should an architecture for safe GenAI look like? I want to suggest that it will have the following components and system interactions:
Let’s take a look at the pieces and how they should work together, starting at the bottom. Data enters the system through a data pipeline, whose job is to make clean, consistent, and secure data available to the large language model (LLM). Here, an AI Security Governance Layer scans training data before the LLM ingests it from the pipeline. Depending on the business objectives of the model, the system can protect against model bias by preventing the model from ingesting sensitive or inappropriate data. If there is a legitimate reason to include sensitive data in training, then that model needs to be tagged appropriately.
In practice, this is accomplished by discovering data that can lead to bias and preventing it from being used during training. This kind of protection can help meet corporate goals such as complying with the Fair Credit Reporting Act or hiring a diverse team. For this to work, sensitive or inappropriate data should be masked before it reaches the model, as sketched below.
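As a rough illustration, here is a minimal Python sketch of that pre-ingestion scan-and-mask step. The SENSITIVE_PATTERNS rules, the record format, and the masking strategy are all assumptions made for this example; a production governance layer would rely on trained classifiers and a managed policy catalog rather than a handful of regexes.

```python
import re

# Hypothetical policy rules: regexes for values that should never reach
# training. A real governance layer would use trained classifiers and a
# managed policy catalog instead of a handful of patterns.
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def scan_and_mask(record: dict) -> dict:
    """Mask sensitive values in a pipeline record before LLM ingestion."""
    clean = {}
    for field, value in record.items():
        text = str(value)
        for label, pattern in SENSITIVE_PATTERNS.items():
            text = pattern.sub(f"[MASKED:{label}]", text)
        clean[field] = text
    return clean

# The raw record never reaches training unmasked.
raw = {"note": "Call Kimberly at 555-867-5309 about SSN 123-45-6789"}
print(scan_and_mask(raw))
# -> {'note': 'Call Kimberly at [MASKED:phone] about SSN [MASKED:ssn]'}
```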
Next, embeddings are created. Embeddings are data representations that carry the semantic information critical to an LLM’s ability to understand content and maintain long-term memory. Embeddings are generated by LLMs, and these features represent the dimensions of the data that are essential for understanding patterns, relationships, and underlying structures.
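For the sake of illustration, here is a brief sketch of embedding generation using the open-source sentence-transformers library. The model name all-MiniLM-L6-v2 is just an example of a small embedding model; any LLM-based embedding service would follow the same pattern.

```python
from sentence_transformers import SentenceTransformer

# all-MiniLM-L6-v2 is a small open embedding model used purely as an
# example; any embedding model or service would fit the same pattern.
model = SentenceTransformer("all-MiniLM-L6-v2")

texts = [
    "Quarterly revenue grew 12% year over year.",
    "Customer support tickets dropped after the release.",
]

# Each text becomes a fixed-length vector that captures its semantics;
# nearby vectors correspond to semantically similar texts.
embeddings = model.encode(texts)
print(embeddings.shape)  # (2, 384) for this model
```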
Embeddings are stored in a vector database, which is responsible for efficiently storing, comparing, and retrieving billions of embeddings (i.e., vectors). A vector database uses a combination of algorithms assembled into a pipeline that provides fast and accurate retrieval of the neighbors of a queried vector. Here, the data security governance layer should scan embeddings as they are created, as they are maintained in the vector database, and as they flow out of it. The goal is to protect sensitive data before it is queried by an internal or external user.
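Below is a minimal sketch of the storage-and-retrieval side, using the FAISS similarity-search library as a stand-in for a full vector database. The governance_scan hook and the sensitive metadata flags are hypothetical; they only illustrate where the governance layer could sit in the flow.

```python
import numpy as np
import faiss  # similarity-search library standing in for a vector database

dim = 384                       # must match the embedding model's output size
index = faiss.IndexFlatL2(dim)  # exact nearest-neighbor index

def governance_scan(metadata: dict) -> bool:
    """Hypothetical hook: block embeddings whose source text was tagged
    sensitive by the governance layer."""
    return not metadata.get("sensitive", False)

# Stand-in embeddings and governance tags for 100 documents.
vectors = np.random.rand(100, dim).astype("float32")
tags = [{"sensitive": i % 10 == 0} for i in range(100)]

# Only embeddings that pass the governance scan are stored.
allowed = [i for i in range(100) if governance_scan(tags[i])]
index.add(vectors[allowed])

# Retrieval: the five nearest neighbors of a query vector.
query = np.random.rand(1, dim).astype("float32")
distances, neighbors = index.search(query, 5)
print(neighbors)
```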
At Prompt Query and Execution, end-user queries are created. The chat agent is connected via an API, a browser, Slack, or the GenAI user interface. The governance API should be inserted and executed in that same library so each request is evaluated contextually before it goes to the LLM. The model’s response should then pass back through the same governance API before the result is returned to the user. Connections to the LLM can also be made through orchestration frameworks like LangChain and SageMaker.
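As a sketch of where that governance API could sit, the following Python stubs wrap an LLM call with a prompt check on the way in and a response check on the way out. GovernanceAPI, Verdict, and call_llm are all hypothetical placeholders, not a real SDK.

```python
from dataclasses import dataclass

# --- Hypothetical governance SDK, stubbed for illustration ---------------
@dataclass
class Verdict:
    allowed: bool
    text: str = ""

class GovernanceAPI:
    """Stand-in for a real data security governance service."""

    def evaluate_prompt(self, user_id: str, prompt: str) -> Verdict:
        # A real service would check identity, entitlements, and policy
        # rules in the context of the whole conversation.
        return Verdict(allowed="password" not in prompt.lower())

    def evaluate_response(self, user_id: str, response: str) -> Verdict:
        # A real service would redact any data the user may not see.
        return Verdict(allowed=True, text=response)

def call_llm(prompt: str) -> str:
    # Placeholder for the actual model call, e.g., via LangChain.
    return "stubbed model answer"

governance = GovernanceAPI()

def governed_completion(user_id: str, prompt: str) -> str:
    """Every request and response passes through the governance layer."""
    if not governance.evaluate_prompt(user_id, prompt).allowed:
        return "Request blocked by policy."
    raw = call_llm(prompt)
    return governance.evaluate_response(user_id, raw).text

print(governed_completion("u123", "Summarize our Q3 results."))
```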
The chat agent operates by constructing a series of prompts to submit to the language model. A compiled prompt typically combines a prompt template hard-coded by the developer with the user’s input and any retrieved context. Before this happens, information shared with the data governance layer confirms the user is appropriately logged in and has rights to the information being requested. The security governance layer needs to evaluate the prompt in the context of the entire conversation and the policy rules, enforcing the required controls dynamically in real time.

Here, the data governance layer has the context-aware intelligence to detect queries that could result in inappropriate structured or unstructured data being shared. Once a prompt is determined to be legitimate, it is forwarded to the chat agent. Where the user does not have access to an element of data, the data governance layer should mask it or block the sharing of the information. Model responses also need to be governed, and redacted or blocked if it is determined the user should not see the result. For example, the prompt “What is Kimberly’s phone number?” might go through to the LLM, but the response needs to redact the phone number supplied by the LLM.
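To make the Kimberly example concrete, here is a minimal sketch of prompt compilation and response redaction. The template, the entitlement flag, and the regex-based phone redaction are illustrative assumptions; a real governance layer would evaluate the full conversation context against policy rules.

```python
import re

# Developer-defined template; the user's question is slotted in at runtime.
PROMPT_TEMPLATE = "You are a helpful HR assistant.\nUser question: {question}"

PHONE = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")

def compile_prompt(question: str) -> str:
    """Combine the hard-coded template with the user's input."""
    return PROMPT_TEMPLATE.format(question=question)

def redact_response(response: str, user_may_see_phones: bool) -> str:
    """Redact phone numbers when the user lacks entitlement to them."""
    if user_may_see_phones:
        return response
    return PHONE.sub("[REDACTED]", response)

# The prompt itself may be legitimate and pass through to the LLM...
prompt = compile_prompt("What is Kimberly's phone number?")
model_answer = "Kimberly's phone is 555-867-5309."  # stand-in LLM output

# ...but the response is redacted before the user sees it.
print(redact_response(model_answer, user_may_see_phones=False))
# -> Kimberly's phone is [REDACTED].
```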
Conclusion
This is the time to move on GenAI. The impact on customer experience and other business functions can be substantial, but only for organizations that put an architecture like this in place to manage, monitor, and control their models safely.