Why AI needs a red team
At Google, we believe that part of building AI responsibly means testing it for security weaknesses, including using red teams to evaluate how AI technology can stand up to realistic threats – which is why we published our first AI Red Team report .
Below we present an interview between Google Cloud CISO Phil Venables and Royal Hansen, vice president of Privacy, Safety, and Security Engineering at Google, on the report’s findings — and why AI needs red teams. They also discussed Google’s overall progress in AI and security.
Phil Venables: Let’s start by talking about why AI needs a red team.??
Royal Hansen: I’m really excited about this. At Google, we believe that red teaming — friendly hackers tasked with looking for security weaknesses in technology — will play a decisive role in preparing every organization for attacks on AI systems. Google has been an AI-first company for many years now, and this paper shows how red teaming is a core component of securing AI technologies.?
It focuses on three important areas: 1) what red teaming in the context of AI systems is and why it is important; 2) what types of attacks AI red teams simulate; and 3) lessons we have learned that we can share with others.
PV: Our team has a singular mission, to simulate threat actors targeting AI deployments. What kinds of attacks is the red team simulating??
RH: The AI Red Team is focusing squarely on attacks on AI systems. We detail in the report six tactics, techniques, and procedures (TTP) that attackers are likely to use against AI: prompt attacks, extraction of training data, backdooring the AI model, adversarial examples to trick the model, data poisoning, and exfiltration.?
Since AI systems often exist as part of a larger whole, we do stress that AI Red Team TTPs should be used along with traditional red team exercises. A good example of this is how our AI Red Team has worked with our Trust and Safety team to help prevent content abuse.
PV: Can you talk about what we learned from the report??
RH: Sure. I’ll start with some tactical lessons.?
We know that to protect against many kinds of attacks, traditional security controls such as ensuring the systems and models are properly locked down can significantly mitigate risk. This is true in particular for protecting the integrity of AI models throughout their lifecycle, which can help prevent data poisoning and backdoor attacks.?
It was helpful to learn that many attacks on AI systems can be detected in the same way as traditional attacks. But others — including prompt attacks and content issues — may require layering multiple safety models. Traditional security philosophies, such as validating and sanitizing both input and output to the models, still apply in the AI space.
From a higher-level point of view, addressing red team findings can be challenging, and some attacks may not have simple fixes. We encourage red teams to partner with security and AI subject matter experts for realistic end-to-end adversarial simulations.
PV: Let’s pull back the lens a bit and look at how we got here with AI, which has really dominated the technology conversation this year. How would you describe its evolution — particularly during your time at Google??
RH: AI isn’t new — we’ve been incorporating AI into our products for more than a decade. If you’ve been using Google Search, or Translate, or Maps, or Gmail, or the Play Store for apps, you’ve been using and benefiting from AI for years.?
领英推荐
One of our AI milestones includes using machine learning to help detect anomalies on our internal networks back in 2011. Today, those capabilities have evolved and regularly help our red teams discover and test sophisticated hacking techniques against Google’s own systems.?
In 2014, we started a Machine Learning Fairness team . In 2018, we adopted our AI principles , which led to spearheading the movement to adopt responsible AI, based on mitigating complexities and risks, and also improving people's lives while addressing social challenges.?
This year, we built on our collaborative approach to cybersecurity by launching our Secure AI Framework (SAIF). SAIF is inspired by best practices for security that we’ve applied to software development, while incorporating our understanding of security megatrends and risks specific to AI systems .?
SAIF is designed to help mitigate risks specific to AI systems like stealing the model , poisoning the training data , injecting malicious inputs through prompt injection , and extracting confidential information in the training data .?
So, while there’s a lot of discussion about generative AI in cybersecurity – and beyond – right now, we’ve been using and learning from AI more broadly in our day-to-day work for years.?
PV: How can we ensure a higher quality of online information, particularly in critical situations such as moments of crisis and war, or elections? How are you thinking about security and protections in these moments that matter, particularly in the age of AI??
RH: Technology can create new threats, but it can also help us fight them. AI can often help counter the issues created by AI. It could even give security defenders the upper hand over attackers for the first time since the creation of the internet.?
For example, Gmail uses AI right now to automatically block more than 99.9% of malware, phishing, and spam, and protects more than 1.5 billion inboxes. AI can help identify and track misinformation, disinformation, and manipulated media. One notable example of that happened last year, when Mandiant discovered and sounded the alarm about the AI-generated “deepfake” video impersonating Ukrainian President Volodymyr Zelensky surrendering to Russia.
We already use machine learning to identify toxic comments and problematic videos . More technical AI innovations we’re working on include watermarking AI-generated images, and creating tools to evaluate online information — like the upcoming “About this Image ” feature in Google Search. We've also joined the Partnership on AI’s Responsible Practices for Synthetic Media , which promotes responsible practices in the development, creation, and sharing of media created with generative AI.
Looking ahead, our challenge is to put appropriate controls in place to prevent malicious use of AI and to work collectively to address bad actors, while maximizing the potential benefits of AI to stay at the front of the global competitiveness race.?
PV: What work is Google doing to manage risks we might face from AI?
RH: We think about AI and security primarily through two lenses. First, using AI to enhance safety and security, and second, securing AI from attack.
While frontier AI models offer tremendous promise to improve the world, governments and industry agree that appropriate guardrails are required on the policy level, on the business level, and on the technology level.?
Their development and deployment will require significant care — including potential new regulatory requirements. We’ve already seen important contributions to these efforts by the U.S. and U.K. governments, the European Union, the G7 through the Hiroshima AI Process, and others. To build on these efforts, further work is needed on safety standards and evaluations to ensure advanced AI systems are developed and deployed responsibly.
The lines are blurring between safety and security in ways that require us to collaborate across cyber and trust and safety, across consumer and enterprise, across public and private sector, national and international.?
With the stakes so high, we’re calling on governments, the private sector, academia, and civil society to work together on a responsible AI policy agenda. And to enable progress in AI, we must focus on three key areas: opportunity, responsibility, and security.
Read Phil and Royal’s full interview here .
CEO & Co-Founder at Detoxio, Detox your GenAI
7 个月A very Valid Question beautifully answered by Google Research Team. It seems Red Teaming GenAI is an evolving concept and there is lack of common understanding. Manual Red teaming , though very creative and valuable, should be augmented by Automated Red Teaming to find out the full coverage. I have made an attempt to define Automated LLM red teaming . Check out the blog and let me know your views https://open.substack.com/pub/detoxioai/p/ai-driven-automated-llm-red-teaming?utm_source=share&utm_medium=android&r=2wroxs
Business Delevoper | Sales Analyst at Qualyteam | Inside Sales Professional | SDR | BDR | Account Executive | Hunter | B2B | Technology | Software | MBA Gest?o de Negócios | USP/Esalq
10 个月Awesome!
--
11 个月Know who your Friends are!
AI Engineer
1 年?? Join Us at the Africa AI Summit 2023! Are you ready to be part of a groundbreaking event that's shaping the future of AI across the African continent? The Africa AI Summit 2023 is here to inspire, connect, and transform. ?? Innovation: Discover the latest trends in artificial intelligence, machine learning, and data science. Unleash your creativity and explore limitless possibilities! ?? Impact: Learn how AI is making a difference in healthcare, education, finance, and more. See how technology can change lives and drive sustainable growth. ?? Stakeholder Partnerships: Network with industry leaders, policymakers, and entrepreneurs. Forge meaningful connections that will drive AI forward in Africa. Join us for insightful discussions, keynote speakers, startup showcases, and hands-on workshops. It's not just an event; it's a movement! ?? Date: 11/10/2023 ??? Location: Virtual ?? Register now; https://lnkd.in/e_4Vt_sg Don't miss this opportunity to be at the forefront of AI innovation. Let's shape the future of Africa together. ??
Driving Operational Excellence: Cost Control Specialist | QA Leader | Document Management Expert | Aconex | Transforming Operations with Precise Cost Control Charts, Dynamic Dashboards, and Streamlined Documentation.
1 年A remarkable convergence of responsible hacking and responsible AI! ???? Google's commitment to building AI responsibly shines through in its innovative approach. Security-testing AI systems through red teams is a proactive stride towards ensuring AI's robustness against real-world threats. Kudos for transparency in sharing findings from the first AI Red Team report, fostering a community of knowledge and collaboration. The conversation between Phil Venables and Royal Hansen underscores the importance of a security-first mindset in AI development, emphasizing that progress must go hand-in-hand with safeguarding. This initiative sets a remarkable example in the tech industry, demonstrating that responsible AI isn't just a goal, but a commitment woven into every facet of advancement. ????? #ResponsibleAI #SecurityInnovation #TechLeadership #CollaborativeAdvancement