Prompt Shields (preview): protecting against direct and indirect prompt injection attacks
Ivana Tilca
Lead Manager @ Allata | Microsoft MVP in Artificial Intelligence | Technology Advocate | Speaker | World Traveler
Prompt Shields is a unified API that analyzes LLM inputs and detects two common types of adversarial input: User Prompt attacks and Document attacks.
Why are Indirect Prompt Attacks different from Direct Prompt Attacks?
First of all, they have different threat models.
In Direct Prompt Attacks.
The Indirect Attacks
What is Prompt Shields and how can it help prevent attacks?
Prompt Shields integrates seamlessly with Azure OpenAI Service content filters and is available in Azure AI Content Safety, providing a defense against both types of attack. Using machine learning and natural language processing, Prompt Shields identifies and neutralizes potential threats in user prompts and third-party data. This capability supports the security and integrity of your AI applications, safeguarding your systems against malicious attempts at manipulation or exploitation.
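To make the idea of one API covering both user prompts and third-party documents more concrete, here is a minimal Python sketch of how a request might be built and how a response might be read. The URL path, the api-version value, the header name, and the JSON field names (userPrompt, documents, userPromptAnalysis, documentsAnalysis, attackDetected) are assumptions based on the preview REST API and should be checked against the current Azure AI Content Safety documentation. The helper names build_shield_request and attack_detected are hypothetical.

```python
import json
import urllib.request

# Assumed values for the preview REST API; verify against current documentation.
SHIELD_PATH = "/contentsafety/text:shieldPrompt"
API_VERSION = "2024-02-15-preview"


def build_shield_request(endpoint, key, user_prompt, documents):
    """Build (but do not send) a Prompt Shields request (hypothetical helper)."""
    url = f"{endpoint.rstrip('/')}{SHIELD_PATH}?api-version={API_VERSION}"
    # One payload carries both the user prompt and the third-party documents.
    body = json.dumps({"userPrompt": user_prompt, "documents": documents})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,  # assumed header name
            "Content-Type": "application/json",
        },
        method="POST",
    )


def attack_detected(response_json):
    """Return True if the user prompt or any document was flagged.

    Assumes a response shaped like:
    {"userPromptAnalysis": {"attackDetected": bool},
     "documentsAnalysis": [{"attackDetected": bool}, ...]}
    """
    user_hit = response_json.get("userPromptAnalysis", {}).get("attackDetected", False)
    doc_hits = [
        doc.get("attackDetected", False)
        for doc in response_json.get("documentsAnalysis", [])
    ]
    return bool(user_hit) or any(doc_hits)
```

To send the request, pass the returned object to urllib.request.urlopen and decode the body with json.load. Keeping request construction separate from the network call makes the sketch easy to inspect without credentials.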
Limitations:
Currently, the Prompt Shields API supports the English language.
A user prompt can contain up to 10,000 characters. The document array can hold at most 5 documents, with a combined total of no more than 10,000 characters.
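Because requests that exceed these limits are likely to be rejected, it can help to check inputs on the client before calling the API. The following is a minimal sketch that encodes only the limits stated above. The function name check_limits and the constant names are hypothetical, and it assumes Python's len() (Unicode code points) is a close enough approximation of how the service counts characters.

```python
# Limits as stated in this article; names are illustrative, not part of any SDK.
MAX_PROMPT_CHARS = 10_000
MAX_DOCUMENTS = 5
MAX_DOCUMENT_CHARS_TOTAL = 10_000


def check_limits(user_prompt, documents):
    """Return a list of limit violations; an empty list means the input fits."""
    problems = []
    if len(user_prompt) > MAX_PROMPT_CHARS:
        problems.append(
            f"user prompt has {len(user_prompt)} characters (max {MAX_PROMPT_CHARS})"
        )
    if len(documents) > MAX_DOCUMENTS:
        problems.append(
            f"{len(documents)} documents supplied (max {MAX_DOCUMENTS})"
        )
    # The document limit applies to the combined length, not to each document.
    total_chars = sum(len(doc) for doc in documents)
    if total_chars > MAX_DOCUMENT_CHARS_TOTAL:
        problems.append(
            f"documents total {total_chars} characters (max {MAX_DOCUMENT_CHARS_TOTAL})"
        )
    return problems
```

Returning every violation at once, rather than raising on the first, lets a caller decide whether to truncate the prompt, split the documents across several requests, or reject the input.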
Benefits of Prompt Shields:
Conclusion
Prompt Shields serves as a vital tool in safeguarding against both Direct and Indirect Prompt Attacks by providing robust detection mechanisms within the LLM environment.
This ensures the security and integrity of AI applications by identifying and neutralizing potential threats, thereby preventing malicious manipulation or exploitation.