Generic AI Agents - Your next Security SPOF?
Jari Pesonen
Head of Cyber Protection & Digital Asset Security | IT/OT/ICS Security | SABSA, AWS Security
It could be that one practical break-through AI implementation in organisations will be Generic AI Agents[1] (AI Agent is a program/agent acting in intelligent manner, interacting with environment). Generic AI Agents are step further from the current spot/use case implementations and will perform much wider number of functions and access wider sets of data. Already, a huge number of spot solutions have been developed[2], where departments and teams can solve their particular issues with a 3rd party SaaS service or one integrated to existing tools, like Microsoft Copilot. As an example, take a look at Artisan (https://artisan.co). Microsoft’s Azure OpenAI Service & Semantic Kernel[3] is one of the options to develop your own solutions and some companies are already using[4] it for the purposes mentioned later. Also, from GPT Store[5] you can fine many AI Agent solutions.
[Note: this is not endorsement of Microsoft, OpenAI or Artisan products, these are used here as example of actual existing product with existing implementations.]
Your environments will have AI Agents, and we will benefit a lot from these. It’s good to be pro-active.
Basic questions and tasks which are currently done by your colleagues, may be done by Agents in the future, such as;
The NextGen challenge comes when spot Agent solutions are replaced with Generic AI Agents.
Use of Generic AI Agents may solve some of the issues organisations face with data siloes and employees spending a lot of time in isolated systems. Spot solutions are easy way to start, but employees do not want to use many tools - the future convergence might be in a single UI with multiple Agents in background, or Generic Agents, which integrate with many systems. Either way, these may become the next integration platform between backend systems and be the centralised place to perform actions and ask questions about ‘anything’. Obviously, we all know that in practise this is not easy.
The typical issues which hamper the implementations are;
My prediction is that in the future organisations will be stricter than ever before on purchasing only systems where data is exportable and APIs are fully supported. Existing fully functional systems will be replaced solely because of lack of this support for APIs and data exportability. Data lakes will play even greater role, as agents could access data lake for questions about financial performance, customers, employees, basically any data that company has. Organisations will want to dump literally everything they can to data lake in the future and make it available to employees. Before generic AI Agents become feasible, many spot solutions will be implemented, and several of the current security issues with Agents need to be solved - such as RBAC controls to data access.
Security and compliance wise, generic AI Agents will lead to some challenges, as we will create a single point of failure, and potentially a very complex black-box system. Even more risky system than the existing integration platforms, as these systems also may suffer from the same security issues than ChatBots PLUS could actually perform harmful actions, such as give access to data for unauthorised persons, or make changes to systems or make payments, for example. To lesser extent, these same issues will come with spot solutions.
Security controls in these Agents are not yet mature.
You probably have seen numerous news articles about bypass of the OpenAI’s ChatGPT security [6][7]. Researchers created amusement through a ChatBot trained to jailbreak other ChatBots[8]. Developers continue to play catch-up and patch holes while new vulnerabilities and bypasses will be discovered.
In general, potential vulnerabilities include:
Securing these systems will be as important as securing data lakes, integration platforms and similar systems. Total risk for this kind of system will be combined risk from all connected systems, all data accessed, all actions performed - which means it may become one of the highest risk systems in the organisation. And frameworks like ISO42001 are very new and lacking in detailed controls. I recommend following OWASP AI project[9] for future guidance.
What should be considered when securing AI Agents
As Agents will have access to confidential data, and integrate to many critical systems, least privilege access becomes increasingly important. Access permissions need to be carefully considered and typical issues with over permissive technical integration accounts should be avoided. The Agent could be used to bypass traditional access controls as it is harder to secure. For example, employee who does not have access to all employee compensation data, might ask from Agent: “How much was paid to X X this month”, while employee might be authorised to only ask about their own salary payments. Consider another example: an Agent which integrates with your email and calendar and finds free slots, organises your time etc - such integrations typically require full access to everyone’s emails and calendars. Would you allow some small 3rd party vendor that level of access?
Auditability plays an important role in Agents. All prompts, actions and answers should be recorded to enable auditability. Note that, if personal data, or other highly confidential, or regulated data is accessed/processed, there could be limitations on how audit trails are retained and/or accessed.
People must remain accountable. If the Agent gives wrong answers for questions about financial performance or makes inaccurate future predictions or calculations, can you find out why this happened? Who would be responsible?
Environments should be continuously monitored to detect system abuse and bypass attempts. Note that Service Providers, such as Microsoft, and OpenAI, will also monitor these by default. “Microsoft personnel analyze prompts, completions and images for harmful content and for patterns suggesting the use of the service in a manner that violates the Code of Conduct or other applicable product terms.” Service providers may allow exemptions on monitoring [10]. Consider what data is being processed and if you want Service Provider to have access to it.
There are open legal questions about the foundation models and the initial training data. For example, NY Times case may lead the models to not be available anymore[11]. If confidential or copyrighted data is used to fine-tune model, it’s not possible to later delete specific data without deleting the model.
Data stored and processed in the system should be secured, with at-rest and in-transit encryption. Data localisation requirements should be considered, depending on the data being processed. Data classification of the source/processed data must match with the classification of the system and controls in place. E.g. CASB, localisation of data.
领英推荐
Agent runtime environments should be secured as it will have access and permissions to potentially critical data and systems, security level should match this risk. Credentials and secrets used by the agent for integrations and data access should be secured (e.g. in HSMs). End-points and platform should have security testing performed regularly and after changes.
Security of the model/agent and security controls implemented within the Agent should be verifiable. The verifications should be repeatable/automated so that all changes to the model and Agent can be tested. Otherwise, there might be new vulnerabilities which can be exploited when new version is released. As Agents are usually ‘black box’, comprehensive testing can be difficult and current tools are lacking and may need to be developed per implementation. How you would verify that HR Bot does not allow access to another employee’s data?
SaaS Agent providers have their own considerations you should take into account. Do you know how the model is trained, with what data, is it trained on copyrighted material, what quality of data, and how its security is tested? Is it tested against typical LLM issues, like biases? What kind of QA processes the service provider uses to ensure quality? Do you know what changes are done and when to the model? Do you know how permission model is implemented for RAG/integrations? Typically, SaaS services are ‘black box’ and lack in transparency and the continuity of small, 3rd party companies is also a high risk as new services come up like mushrooms after rain, and not all will survive. And we do not yet have auditors or standards to audit against to get some assurance.
In addition to these AI Agent specific considerations, normal SaaS/3rd party security requirements should apply, such as BCP/DR, physical security, defining shared responsibility model, incident response and notification, change management, certification/compliance requirements etc.
Data access and permission models seem to be a big pain point right now.
As far as I can see, ChatBots and, for example, Microsoft’s Azure Document Intelligence do not offer ways to restrict access to the data. Once data is processed/indexed, it’s available to all users. Microsoft will (soon) release support for the RBAC model in Vector database (Document Intelligence), which may mitigate some of issues. While role base access control on Azure AI services in general is quite limited [12].
Models are not aware of the User or User’s permissions.
Which means that if you train or fine-tune your own model with confidential data, it’s available to all users. For RAG (Retrieval Augmented Generation), permission model would need to be developed per integration/data source, creating potentially complex, custom authorisation implementations. For actions performed by Agents, these probably also require custom permission model implementations, depending heavily on the type of integration, and if the agent performs the actions using user’s permissions or not. Actions could in some cases have additional human level review before being executed.
There are several limitations on security controls on Agents. For example, Azure Document Intelligence DLP offering is limited to ‘configure list of outbound URLs OpenAI resources are allowed to access’ [13]. Obviously, as there are severe concerns on ChatBots to leak data outside of the environment.
Jari’s top recommendations
Implement a clear AI Usage Policy[14] to guide and to set the governance model. These solutions will come, to support business processes. Train users on what is allowed and what is not and how they should engage with privacy, security and compliance teams to get support.
Be wary of spot solutions/pilots which may be introduced as Shadow-IT, and will soon integrate to your critical business processes and process critical data. As normal, something which starts as Pilot, is soon to be your new business critical system.
When considering implementing AI Agents, start with threat modelling and risk assessments. Understand what data is processed, review dataflow diagrams, and understand what actions are performed and to what kind of risks these may lead. What is the pilot vs potential future usage?
3rd party/SaaS risk management plays a critical role, as companies will most likely use external services for implementations. Have strong processes and enforcement in place with management support.
Some good advice from Microsoft: “Carefully consider well-scoped chatbot scenarios. Limiting the use of the service in chatbots to a narrow domain reduces the risk of generating unintended or undesirable responses.”
Microsoft provides other good advices for those considering implementations [15].
Big thanks for Joni Leskinen , Tuomas Mett?nen and Niki Klaus for feedback!
References:
The picture is AI Generated with StableDiffusion.
IT Certification at TIBCO
1 年Discover the secret to F5 Certification success at www.certfun.com/f5! ???? Their online practice exams are a goldmine for sharpening your skills. #CertFun #F5Success #SkillBuilding ??
Trailblazing Human and Entity Identity & Learning Visionary - Created a new legal identity architecture for humans/ AI systems/bots and leveraged this to create a new learning architecture
1 年Hi Jari, My thoughts back to you... 1. The core security issue with Ai systems and bots is an inability to instantly determine friend from foe. Skim “The Challenge with AI & Bots - Determining Friend From Foe” - https://www.dhirubhai.net/pulse/challenge-ai-bots-determining-friend-from-foe-guy-huntington/ It refers to these three articles I also suggest you skim: “* A Whopper Sized Problem- AI Systems/Bots Beginnings & Endings” - https://www.dhirubhai.net/pulse/whopper-sized-problem-guy-huntington/ * “Hives, AI, Bots & Humans - Another Whopper Sized Problem”- https://www.dhirubhai.net/pulse/hives-ai-bots-humans-another-whopper-sized-problem-guy-huntington * “AI Leveraged Smart Digital Identities of Us” - https://www.dhirubhai.net/pulse/ai-leveraged-smart-digital-identities-us-guy-huntington/ I'll continue in the next message...
IT security professional at heart. Certified CISSP, CCSP, CSSLP, CRISC, CISA, and CISM.
1 年This will be a challenge like all new technologies. I also imagine that a receptionist should get a different answer from the one given to the CEO by the AI (at least for some questions). I am quite lucky a have a close look, as Kyndryl is working with a leading AI company to create AI based digital bankers. You will open your mobile banking app and speak directly to a generative AI model that will be able to do most of the things that are done by bank clerks today. It's both thrilling and scary :)