9. AI System Attacks
Guardians of AI, by Richard Diver

9. AI System Attacks

In any sports setting there is a constant shift in the game between attack and defense. While cybersecurity is not a game, it does have the same elements of strategy playbooks, skilled operators, and technology that helps defend from a wide range of attacker behavior.

There are many ways to attack and manipulate an AI system, each one requires a different strategy but have similar methods for the detection and response. Let's explore a few here:

Insider risk and social engineering

One of the first scenarios to cover is the potential for social engineering or insider risk. By understanding the actions and information that can be exposed to a trusted individual, it is possible to map out how an attacker might use generative AI inside a company. The attack may come from existing methods such as a phishing email or instant messaging, or it may be hidden inside of content being used for analysis. As part of a Zero Trust approach, we can use the principle of "Assume Breach" and expect that at least one person is acting under the influence of social engineering techniques or may be a genuine insider risk acting on their own devious intentions. Monitoring user behaviors is critical.

Another easy opportunity is to manipulate the promptbooks that might be in use within an organization, these are pre-define prompt templates that can be used to craft a better response from the LLM with advanced prompt-engineering. Ensure the creation, storage, and use of these tools are well moderated and regularly tested.

Malicious prompts and poisoned content

Focusing on the instructions being sent to the user prompt is a great way to uncover many of the following risky actions that can occur afterwards. You can read more about the potential issues involved by reading this article on the Prompt Shield (using the Spotlight mitigation technique) for poisoned content, and this article on the Crescendo method of achieve a jailbreak.

The following diagram is used as a way to map the different variations of attack path, from initial point of entry and through the various steps required to complete the malicious activity, and then exfiltration of information. The mitigations are also applied along the same path to show how each one can play a part of a layered defense (or defense in depth).

Diagram of a threat mapping template using the three layers of the AI systems framework. Includes AI Usage, AI Application, and AI Platform
AI threat mapping template

As you build out a threat mapping diagram like this, consider how the actions may occur in the AI usage layer, the impact that can have in the AI application, and how the AI platform (and AI model) will respond.

For more advanced attacks against AI, read the book "Not with a bug, but with a sticker", by Ram Shankar Siva Kumar, and Hyrum Anderson, PhD.

Book cover for "Not with a bug, but with a sticker".
Book cover "Not with a bug, but with a sticker"


Targeting AI application and AI platform services

By targeting the AI supply chain there is an opportunity for an attack to have much greater impact than going after prompt injection one by one. By targeting the trusted components such as skills, functions, and plugins, it could be possible to impact multiple organizations and do it while remaining undetected (similar to past software supply chain issues). It is important to ensure that development of AI solutions is secure from the coding infrastructure and data sources to the 3rd party and open-source software components. The AI platform hosting the AI model can also be attacked in many creative ways:

  • Availability of services can be impacted by sustained high-volume DDoS attacks.
  • Removing access to any dependencies, including storage accounts in another service, or loss of access to information databases or code repository.
  • Access via trusted physical networks such as a company office or remote site that is not well secured, enables direct trusted access into cloud networks.
  • The use of remote connectivity software including RDP or VPN clients, anything that is designed to give administrators remote access can also be manipulated by remote attack.
  • Access via the service providers customer administration portal, or command line interface, if not properly protected with identity and access management solutions like multi-factor authentication.

Because of the nature of these services, they run 24 hours a day, 365 days a year. This provides endless opportunities to probe and test them, looking for weakness and opportunity. There are plenty of different ways to attack an AI system, ensure you think of each angle and provide several mitigations for each one. Ensure continuous testing to probe for new weakness in process and procedure, along with technical misconfiguration or oversight.

Here is my favorite quote from this chapter:


Quote by Richard Diver


The book is available now on Amazon - Guardians of AI: Building innovation with safety and security .

In the next newsletter we will explore some of the key insights from Chapter 10: AI System Defense.

要查看或添加评论,请登录

社区洞察

其他会员也浏览了