9. AI System Attacks
In any sports setting there is a constant shift in the game between attack and defense. While cybersecurity is not a game, it does have the same elements of strategy playbooks, skilled operators, and technology that helps defend from a wide range of attacker behavior.
There are many ways to attack and manipulate an AI system, each one requires a different strategy but have similar methods for the detection and response. Let's explore a few here:
Insider risk and social engineering
One of the first scenarios to cover is the potential for social engineering or insider risk. By understanding the actions and information that can be exposed to a trusted individual, it is possible to map out how an attacker might use generative AI inside a company. The attack may come from existing methods such as a phishing email or instant messaging, or it may be hidden inside of content being used for analysis. As part of a Zero Trust approach, we can use the principle of "Assume Breach" and expect that at least one person is acting under the influence of social engineering techniques or may be a genuine insider risk acting on their own devious intentions. Monitoring user behaviors is critical.
Another easy opportunity is to manipulate the promptbooks that might be in use within an organization, these are pre-define prompt templates that can be used to craft a better response from the LLM with advanced prompt-engineering. Ensure the creation, storage, and use of these tools are well moderated and regularly tested.
Malicious prompts and poisoned content
Focusing on the instructions being sent to the user prompt is a great way to uncover many of the following risky actions that can occur afterwards. You can read more about the potential issues involved by reading this article on the Prompt Shield (using the Spotlight mitigation technique) for poisoned content, and this article on the Crescendo method of achieve a jailbreak.
The following diagram is used as a way to map the different variations of attack path, from initial point of entry and through the various steps required to complete the malicious activity, and then exfiltration of information. The mitigations are also applied along the same path to show how each one can play a part of a layered defense (or defense in depth).
As you build out a threat mapping diagram like this, consider how the actions may occur in the AI usage layer, the impact that can have in the AI application, and how the AI platform (and AI model) will respond.
For more advanced attacks against AI, read the book "Not with a bug, but with a sticker", by Ram Shankar Siva Kumar, and Hyrum Anderson, PhD.
领英推荐
Targeting AI application and AI platform services
By targeting the AI supply chain there is an opportunity for an attack to have much greater impact than going after prompt injection one by one. By targeting the trusted components such as skills, functions, and plugins, it could be possible to impact multiple organizations and do it while remaining undetected (similar to past software supply chain issues). It is important to ensure that development of AI solutions is secure from the coding infrastructure and data sources to the 3rd party and open-source software components. The AI platform hosting the AI model can also be attacked in many creative ways:
Because of the nature of these services, they run 24 hours a day, 365 days a year. This provides endless opportunities to probe and test them, looking for weakness and opportunity. There are plenty of different ways to attack an AI system, ensure you think of each angle and provide several mitigations for each one. Ensure continuous testing to probe for new weakness in process and procedure, along with technical misconfiguration or oversight.
Here is my favorite quote from this chapter:
The book is available now on Amazon - Guardians of AI: Building innovation with safety and security .
In the next newsletter we will explore some of the key insights from Chapter 10: AI System Defense.