Securing AI Agents: 4 Controls for Responsible Development
Security professionals and data scientists in particular need to secure highly capable AI systems. While there is much discussion about AGI, our focus, especially in the security industry, should be on Artificial Capable Intelligence (ACI). ACI is already here, has been for over a year, and its use is expected to grow exponentially.
In practical terms, this growth will be fueled by the introduction of Neural Processing Units (NPUs) to the market in the coming months. NPUs are chips whose performance is measured in teraflops and which enable fast model inference. They will facilitate the development and deployment of local AI models, introducing a new paradigm in technical risk controls.
We are already witnessing the development and deployment of AI Systems (like ChatGPT) and Agents (such as those built with automation tools like Zapier). These systems have varying levels of autonomy and operational scope, and it is crucial to maintain transparency into the actions AI Agents and Systems perform.
Clearly, this technology presents both opportunities and threats. The conversation around existential risk is well documented. The real challenge, however, lies not in low-probability, high-impact risks, but in whether AI Systems and Agents are developed and deployed responsibly.
People will delegate decisions to these systems, as already happens in contexts like SEO or market-tracking funds, and ask them to take actions such as surfacing relevant information or executing trades.
This isn't about the models themselves; there is plenty of ongoing work to ensure models do not exhibit bias or facilitate harmful actions. It's about the systems in which they operate: What context is being added to these models? How do we ensure that this context is secure? Attackers are almost certainly already eyeing these systems as an attack surface.
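To make the context problem concrete, here is a minimal sketch of treating retrieved context as untrusted input before it reaches a model. Everything here is an illustrative assumption, not a production defense: the function names (`looks_injected`, `build_prompt`) are hypothetical, and a naive deny-list like this would need to be paired with provenance checks or a trained classifier in practice.

```python
import re

# Naive deny-list of prompt-injection markers. Illustrative only; real systems
# would combine provenance tracking with a classifier, not string matching.
INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"disregard the system prompt",
    r"you are now",
]

def looks_injected(chunk: str) -> bool:
    """Flag retrieved text that appears to carry instructions aimed at the model."""
    lowered = chunk.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def build_prompt(user_query: str, retrieved_chunks: list[str]) -> str:
    """Treat retrieved context as untrusted: filter it, then clearly delimit it."""
    safe_chunks = [c for c in retrieved_chunks if not looks_injected(c)]
    context = "\n---\n".join(safe_chunks)
    return (
        "Answer using ONLY the context between the markers.\n"
        f"<context>\n{context}\n</context>\n"
        f"Question: {user_query}"
    )
```

The design point is that context crossing into the prompt is a trust boundary, and it deserves the same input validation we already apply to user-supplied data elsewhere.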
Security professionals need to consider new control areas for ACI (one way to encode these as an enforceable policy is sketched after the list):
- Task Specificity: ACIs' roles should be clearly defined, whether intended for specific tasks or broader general capabilities. The risk profile for a general-purpose agent differs significantly from that of a narrow agent.
- Operational Scope: The proactive and reactive capabilities of ACIs must be delineated to understand their potential impact and the security measures necessary to govern their actions. General-purpose proactive agents are likely to carry greater risk, as they are both unbounded and highly capable.
- Decision Autonomy: The level of human oversight required for ACIs must be carefully considered to balance efficiency with the need for control and accountability.
- Transparency and Accountability: Clear mechanisms for transparency in ACIs' decision-making processes and accountability for their actions are essential to ensure trust and ethical alignment with societal values.
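Here is a minimal sketch of encoding these four controls as an explicit, auditable policy, assuming a policy object gates each proposed agent action. All names here (`AgentPolicy`, `authorize`, the task and action strings) are hypothetical illustrations, not an established API.

```python
from dataclasses import dataclass
from enum import Enum

class Autonomy(Enum):
    HUMAN_APPROVES_EACH_ACTION = "human_approves_each_action"
    HUMAN_REVIEWS_AFTER = "human_reviews_after"
    FULLY_AUTONOMOUS = "fully_autonomous"

@dataclass(frozen=True)
class AgentPolicy:
    # Task specificity: an explicit allow-list of tasks, not open-ended capability.
    allowed_tasks: frozenset[str]
    # Operational scope: which tools or actions the agent may invoke.
    allowed_actions: frozenset[str]
    # Decision autonomy: how much human oversight gates each action.
    autonomy: Autonomy
    # Transparency and accountability: whether every action is audit-logged.
    audit_all_actions: bool = True

def authorize(policy: AgentPolicy, task: str, action: str) -> str:
    """Return 'deny', 'needs_approval', or 'allow' for a proposed agent action."""
    if task not in policy.allowed_tasks or action not in policy.allowed_actions:
        return "deny"
    if policy.autonomy is Autonomy.HUMAN_APPROVES_EACH_ACTION:
        return "needs_approval"
    return "allow"

# Example: a narrow, reactive assistant with post-hoc human review.
policy = AgentPolicy(
    allowed_tasks=frozenset({"summarize_portfolio"}),
    allowed_actions=frozenset({"read_market_data"}),
    autonomy=Autonomy.HUMAN_REVIEWS_AFTER,
)
assert authorize(policy, "summarize_portfolio", "read_market_data") == "allow"
assert authorize(policy, "summarize_portfolio", "execute_trade") == "deny"
```

The value of a structure like this is that task specificity, scope, and autonomy become explicit data you can review and audit, rather than implicit behavior buried in prompts.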
The security community needs to be threat modeling AI Systems. What happens if a memory component is compromised? An attacker could insert memories into, or delete memories from, someone's personal assistant. How do we detect that? How do we secure a memory component? We have existing methods for hardening these components, but do we need more?
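As one example of applying an existing hardening method to a memory component, here is a minimal sketch that tags each memory record with an HMAC so inserted or altered memories fail verification. The key handling and record schema are assumptions (in practice the key would live in a KMS or TPM), and note the limitation: per-record tags alone do not detect deletions, which would require something like a signed index or Merkle tree over the whole store.

```python
import hashlib
import hmac
import json
import time

# Assumption: the signing key is held where the agent runtime can verify but
# not forge tags (e.g., a secrets manager); hard-coded here for illustration.
MEMORY_KEY = b"replace-with-a-key-from-your-secrets-manager"

def tag_memory(record: dict) -> dict:
    """Attach an integrity tag when a memory record is written."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    tagged = dict(record)
    tagged["_tag"] = hmac.new(MEMORY_KEY, payload, hashlib.sha256).hexdigest()
    return tagged

def verify_memory(record: dict) -> bool:
    """Detect inserted or altered memories by recomputing and comparing the tag."""
    tag = record.get("_tag")
    if tag is None:
        return False
    body = {k: v for k, v in record.items() if k != "_tag"}
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    expected = hmac.new(MEMORY_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

stored = tag_memory({"ts": time.time(), "text": "User prefers morning meetings"})
assert verify_memory(stored)
stored["text"] = "Wire funds to account X"   # simulated memory tampering
assert not verify_memory(stored)
```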
Thanks for reading,
Matt
#ContextIsAllYouNeed #ResponsibleAI #SecureByDesign