Course: Security Risks in AI and Machine Learning: Categorizing Attacks and Failure Modes

Supply chain attacks

- [Instructor] With all of the great code plugins and libraries available for use, it's rare that anyone writes a new app from scratch these days. If you need map functionality in your app, you don't write a mapping system from scratch; you embed one of the many popular map services or APIs. Developers speed up delivery by leveraging each other's work, thanks to a robust and complex software supply chain, a supply chain that now includes AI and ML development thanks to pretrained models and Model Zoos. One of the best-known pretrained models is GPT-3, the Generative Pre-trained Transformer 3. GPT-3 generates written text when it is queried with a prompt. So, for example, if you want a poem written about COVID, you can ask GPT-3 to do that for you. How about a cookie recipe or a new song? GPT-3 can do that too. It would be cost-prohibitive for most companies to train their own language generation model, but using GPT-3 is fast and easy. Model Zoos are also part of the ML and AI supply chain. They're repositories of pretrained models that can be used to augment your app with ML or to provide additional functionality. ONNX, the Open Neural Network Exchange, is an open set of common operators and file formats that enables developers to use models across different tools and frameworks. ONNX also has its own Model Zoo of pretrained models, along with tools to help developers convert their existing models to ONNX. Model Zoos are a really valuable way for developers to incorporate ML into their solutions and leverage the work of others without having to reinvent the ML wheel. GitHub also hosts a list of Core ML models that are available for use by scientists and developers. So Model Zoos and pretrained models are a fantastic resource, but like any component in the software lifecycle, they must be used wisely. Savvy attackers could tamper with models in a number of ways, including poisoning a model's training data or inserting malicious code and backdoors into it. A model with poisoned data will not produce accurate outcomes, putting the integrity of your entire application at risk, and a model with a backdoor could leak your sensitive data to the attacker. To keep your AI and ML supply chain safe, only use pretrained models from trusted sources, and ensure that they have been properly vetted and tested before use.
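To make the GPT-3 example concrete, here is a minimal sketch of asking it for a poem. It assumes the legacy pre-1.0 openai Python package and an OPENAI_API_KEY environment variable; the prompt and token limit are arbitrary choices, not anything prescribed by the course.

```python
# Minimal sketch: querying GPT-3 with a prompt (legacy openai<1.0 client).
# Assumes OPENAI_API_KEY is set in the environment, which the 0.x client
# reads automatically.
import openai

response = openai.Completion.create(
    engine="text-davinci-003",               # a GPT-3-family completion model
    prompt="Write a short poem about COVID.",
    max_tokens=100,
)
print(response.choices[0].text.strip())
```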
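And here is a hedged sketch of what consuming a Model Zoo model looks like in practice, using ONNX Runtime. The file name resnet50-v1-7.onnx and the 224x224 input shape are assumptions based on a typical image classifier from the ONNX Model Zoo, not requirements.

```python
# Minimal sketch: running a pretrained ONNX Model Zoo model locally.
import numpy as np
import onnxruntime as ort

# The model file and its expected NCHW input shape are assumptions here.
session = ort.InferenceSession("resnet50-v1-7.onnx",
                               providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
dummy_image = np.random.rand(1, 3, 224, 224).astype(np.float32)

outputs = session.run(None, {input_name: dummy_image})
print(outputs[0].shape)  # e.g. (1, 1000) class scores for an ImageNet model
```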
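The "malicious code inserted into a model" risk is not hypothetical for some formats. PyTorch checkpoints, for instance, are pickle-based, and unpickling can execute arbitrary code. Here is a hedged sketch of one mitigation, assuming PyTorch 1.13 or later; the file name is a placeholder.

```python
# Minimal sketch: loading an untrusted checkpoint more safely.
# torch.load() unpickles by default, which can run attacker-supplied code;
# weights_only=True (PyTorch >= 1.13) restricts loading to plain tensor data.
import torch

state_dict = torch.load("pretrained_model.pt", weights_only=True)
```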
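Finally, one concrete vetting step implied by the closing advice is verifying a downloaded model against a digest published by its trusted source. A minimal sketch using only the Python standard library; the file name and expected digest are placeholders you would take from the publisher.

```python
# Minimal sketch: verify a downloaded model's SHA-256 digest before loading it.
import hashlib

EXPECTED_SHA256 = "0123abcd..."  # placeholder: copy from the publisher's release page

def sha256_of(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()

if sha256_of("model.onnx") != EXPECTED_SHA256:
    raise RuntimeError("Model does not match the published hash; refusing to load it.")
```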
