How to Protect AI From Hackers

Thoughts about digital transformation and AI for enterprise leaders and their legal & compliance advisors

These posts represent my personal views on enterprise governance, regulatory compliance, and legal or ethical issues that arise in digital transformation projects powered by the cloud and artificial intelligence. Unless otherwise indicated, they do not represent the official views of Microsoft.

A little over a year ago I wrote about some remarkable work from MIT and Microsoft Research showing that when AI systems are not properly trained, their judgments can be not only wrong but deeply unfair. The researchers found that several online services performing gender classification of user-submitted photos (including a service offered by Microsoft) were much more likely to make mistakes on dark-skinned women than on light-skinned men.

As I wrote at the time, this unexpected outcome, which received front-page coverage in the New York Times, shocked the conscience of the AI community. Fortunately, it also spurred the community to think deeply about how to fix the problem—a key step is to ensure that the image examples used to train AI face analysis algorithms are representative of the full diversity of human faces. My view is that this incident was ultimately a win for AI because it showed that its mistakes can be identified and corrected, at least in principle. I’m not so sure the same can always be said for humans.

But evaluating the performance of AI systems is a vast topic, and fairness is only one issue among many that need to be considered. Another critical issue is security. Now that AI is being widely deployed not only on consumer websites but in many mission-critical applications, ensuring that it is properly protected from malicious attacks is of paramount importance.

I don’t mean to suggest that AI systems are unusually vulnerable to hacking—there is no evidence that they are more at risk than all the conventional pre-AI software systems that we surround ourselves with, whether it be those that run our electrical grids, our hospitals, our financial institutions, or our smartphones. But it is true that AI’s vulnerabilities are of a specific kind that demand specific defensive measures.

To address these concerns Microsoft engineers have been working with researchers at Harvard to develop a standard classification of security threats to AI and machine learning systems along with ways to mitigate them. In a summary of their work published last week entitled “Failure Modes in Machine Learning,” the group emphasizes that they are addressing not only engineers but also lawyers and policymakers. Legal and compliance leaders in organizations working on AI projects will want to be aware of this work as well.

In what follows I provide several illustrations of possible attacks on AI systems and the defensive measures that Microsoft engineers are developing (or in some cases already implementing). All involve hackers trying to trick or even steal the underlying model used by an AI to make classification judgments.

When an AI reveals too much about what it knows, hackers might be able to leverage that information to extract private information from it (a so-called “model inversion attack”). In one striking example, researchers were able to use an AI’s answers to face recognition queries to reconstruct what a person’s photo looked like using only the person’s name. The researchers exploited the fact that when they submitted a face image and a name to the AI, it not only said whether the pairing was correct but also gave a confidence score. By repeatedly adjusting an algorithmically generated image and observing changes in the confidence score, they were able to reverse engineer a reasonable approximation to the true image associated with a given name, even though they had no access to that image.
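
To make the mechanics concrete, here is a minimal sketch of that kind of confidence-guided reconstruction. It is only an illustration of the principle, not the researchers’ actual method: query_confidence below is a hypothetical stand-in for a black-box face-recognition API, and the attack is simple hill climbing that keeps any random tweak that raises the reported confidence.

```python
# A minimal sketch of a model-inversion-style attack: hill-climb on an image to
# maximize the confidence score a (hypothetical) face-recognition service reports
# for a target name. The "service" here is a stand-in defined locally; a real
# attack would query the actual online API instead.
import numpy as np

rng = np.random.default_rng(0)
TARGET = rng.random((32, 32))           # hidden "true" face, unknown to the attacker

def query_confidence(image):
    """Stand-in for the black-box API: returns only a confidence score."""
    return float(1.0 - np.mean(np.abs(image - TARGET)))

def invert_model(steps=20_000, noise=0.05):
    """Reconstruct an approximation of the hidden image from confidence scores alone."""
    guess = rng.random((32, 32))
    best = query_confidence(guess)
    for _ in range(steps):
        candidate = np.clip(guess + rng.normal(0.0, noise, guess.shape), 0.0, 1.0)
        score = query_confidence(candidate)
        if score > best:                # keep only tweaks that raise the confidence
            guess, best = candidate, score
    return guess

reconstruction = invert_model()
print("final reported confidence:", round(query_confidence(reconstruction), 3))
```

The defensive implication follows directly: the less precise the confidence score a service reveals, and the fewer queries it allows, the harder this loop becomes.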

A related attack goes a step further: instead of prying private information out of the system one piece at a time, it bombards the system with repeated queries until it can replicate the entire AI model that the target system uses. Once the hackers have that model, they can use it in any way they like. This has worrisome implications for privacy. In one study, researchers reverse-engineered the model used by an online credit scoring service, enabling them to guess what score the service would give to any loan applicant.
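
Here, too, a minimal sketch helps show why this works. The linear credit-scoring service below is hypothetical (its query_score function and hidden weights are invented for illustration): the attacker records the service’s answers to a few hundred queries, and a surrogate fitted to those answers reproduces the hidden model almost exactly.

```python
# A minimal sketch of model extraction against a hypothetical linear credit-scoring
# service: the attacker only calls query_score(), yet recovers a near-perfect copy
# of the hidden model by fitting a surrogate to the recorded responses.
import numpy as np

rng = np.random.default_rng(1)
HIDDEN_WEIGHTS = rng.normal(size=5)     # the service's secret parameters
HIDDEN_BIAS = 0.7

def query_score(applicant):
    """Stand-in for the black-box API: returns a credit score for one applicant."""
    return float(applicant @ HIDDEN_WEIGHTS + HIDDEN_BIAS)

# 1. Bombard the service with queries and record its answers.
queries = rng.normal(size=(200, 5))
answers = np.array([query_score(q) for q in queries])

# 2. Fit a surrogate model to the (query, answer) pairs.
design = np.hstack([queries, np.ones((len(queries), 1))])   # add a bias column
stolen, *_ = np.linalg.lstsq(design, answers, rcond=None)

print("recovered weights:", np.round(stolen[:-1], 3))
print("hidden weights   :", np.round(HIDDEN_WEIGHTS, 3))
```

Real services use far more complex models, but the economics are the same: every answered query leaks a little information about the model that produced it.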

But perhaps the most dangerous form of attack of all is the creation of adversarial examples that can trick an AI system into making wrong judgments—sometimes dangerously wrong. An adversarial example is one that has been crafted to exploit the fact that neural networks (the basic algorithm family used in most modern AI), despite their remarkable performance on such tasks as object and face recognition or machine translation, perceive the world in fundamentally different ways than humans. Because they respond to subtle statistical patterns in data that a human cannot see, they can sometimes be tricked into seeing things that are literally not there.
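
One widely studied recipe for crafting such examples is the fast gradient sign method, sketched below on a toy linear “bus detector” so the gradient can be written out by hand. The weights and the blank “image” are stand-ins invented for illustration, not any production model; the point is simply that a faint, carefully chosen noise pattern can flip the classifier’s judgment.

```python
# A minimal sketch of the fast gradient sign method (FGSM) on a toy linear "bus
# detector". For a linear model the gradient of the "bus" score with respect to
# the input is just the weight vector w, so nudging pixels in the direction
# sign(w) raises the score as fast as possible for a given per-pixel change.
import numpy as np

rng = np.random.default_rng(2)
w = rng.normal(size=784)                      # toy detector: one logit for the class "bus"
b = -0.5

def predict(x):
    return "bus" if x @ w + b > 0 else "not bus"

x = np.zeros(784)                             # a blank 28x28 "image"
print("blank image      :", predict(x))       # -> not bus

epsilon = 0.05                                # maximum change allowed per pixel
x_adv = np.clip(x + epsilon * np.sign(w), 0.0, 1.0)

print("with faint noise :", predict(x_adv))   # -> bus
print("largest pixel change:", np.max(np.abs(x_adv - x)))
```

Against a deep network the gradient comes from backpropagation rather than a hand-written formula, but the recipe is the same.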

This image shows a seemingly random patch of visual noise (on the left) that has been crafted to trigger an object recognition network into mistaking it for a bus.

In another study, a research team produced a 3D-printed turtle whose texture had been optimized by trial and error to trick an image classifier (also based on a neural network) into mistakenly identifying the turtle as a rifle. The brief video clip below shows that the network is extremely confident in a judgment that to human eyes is absurd.

(click here for video)

These examples may suggest to technology skeptics that AI and especially neural nets are too risky to be deployed in real-life scenarios. But this conclusion is unjustified. What I haven’t told you is that all of the above scenarios exist only in the laboratory as academic experiments—to my knowledge, none have yet occurred in the real world. That is precisely why now, before these techniques leave the lab, is the time to be vigilant.

There is more good news. Microsoft and many other researchers are hard at work on a wide range of proactive and reactive defensive measures to block malicious attacks that might use these methods against real-world AI systems. Some of these measures are easy to implement, such as blocking users of online AI services from submitting too many requests in a short span of time (since such requests can be used to probe the internal structure of the AI models). Others involve deliberately manufacturing large numbers of adversarial examples so that AIs can be trained to recognize them as dangerous. Still others require modifications to the AI algorithms themselves to make them more robust to known modes of failure.
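
To illustrate the simplest of these measures, here is a minimal rate-limiting sketch (the caller ID, window size, and request budget are arbitrary placeholders): callers who exceed a per-minute query budget are refused, which raises the cost of the high-volume probing that extraction and inversion attacks depend on.

```python
# A minimal sketch of per-caller rate limiting for an online scoring API. The
# thresholds are illustrative; production systems would also log, alert, and
# apply smarter anomaly detection rather than a single fixed budget.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REQUESTS_PER_WINDOW = 100

_recent = defaultdict(deque)                 # caller id -> timestamps of recent requests

def allow_request(caller_id, now=None):
    """Return True if this caller is still under the per-window query budget."""
    now = time.monotonic() if now is None else now
    window = _recent[caller_id]
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()                     # drop timestamps outside the window
    if len(window) >= MAX_REQUESTS_PER_WINDOW:
        return False                         # budget exhausted: reject (or delay) the call
    window.append(now)
    return True

# Example: the 101st request within a minute is refused.
for i in range(101):
    allowed = allow_request("suspicious-client", now=i * 0.1)
print("last request allowed?", allowed)
```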

We are still in the early days of widespread deployment of AI models in business and consumer applications. Legal and compliance leaders in organizations that are working on such projects should spend time now learning about the strengths and weaknesses of these models. Making certain that the models are secure is not a purely technical question: it will require new kinds of contracts, user agreements, and privacy policies and, yes, new kinds of government regulation. The Microsoft team behind the work described here hopes that it will enable a steady stream of AI innovations that benefit society while remaining safe and secure.

Microsoft has published a book about how to manage the thorny cybersecurity, privacy, and regulatory compliance issues that can arise in cloud-based Digital Transformation—including a section on artificial intelligence. The book explains key topics in clear language and is full of actionable advice for enterprise leaders. Click here to download a copy; a Kindle version is also available here.



