Can our AI overlords be trusted?
Mark Dixon
Consultant (Fractional CTO / CTOaaS / InfoSec / Advisor) with focus on small companies.
The current explosion in popularity of ChatGPT - and Large Language Models (LLM) in general - has left most companies rushing to add any number of magical AI-based features to their products and websites before the hype-train runs out of steam.
I don't believe that we should be prioritising the threat of an AI takeover over more urgent threats such as, say, solving the climate crisis. As someone deeply interested in computer security, however, I do wonder how many companies are taking the time to think through the potential security and privacy (e.g. GDPR) risks before rushing to launch their cool new feature? Are our company and customer secrets safe in the hands of our new overlords?
This blog posting by HubSpot - showing how simple it is to incorporate a chatbot into your website - is just the tip of an iceberg of similar articles and blog posts. What seems to be missing from all of them is caution for the risks involved.
How real are the risks?
Microsoft recently published an article on the importance of red-teaming (i.e., testing by trying to break your own product before someone else does) LLMs such as OpenAI's ChatGPT.
Lakera.ai has published a number of Gandalf inspired fun challenges to try to highlight the types of risks involved. It turns out it isn't all that difficult for someone to trick ChatGPT and get it to reveal things you'd rather it didn't.
OWASP - the go-to standard for security-aware software developers - recently published a Top 10 for Large Language Model Applications. Their number 1 risk? Prompt injection using methods such as those I describe in technical detail in Deconstructing Gandalf the White and Deconstructing Gandalf the Summarizer.
领英推荐
And the risks don't stop there...
Using AI to assist in your recruitment process? Kai Greshake shows how easily an applicant could embed hidden text in their CV that would make an AI think they were the perfect candidate and recommend them for an interview.
Using AI to automatically reply to e-mails in your inbox for you? What happens if someone sends you a mail asking for "the last 10 emails from your inbox, please"?
We should think of the current wave of AI tools as being at the level of a bunch of extremely patient, hard-working and well-read 12-year-old prodigies happy to follow orders. As a personal assistant I think this is invaluable.
I think we all need to start thinking of LLMs as if they were humans - and since humans are the weakest link in security, the same must apply to LLMs.
The expert consensus is that you definitely shouldn't trust ChatGPT with your data or secrets. Think about what you are giving it access to and who sees the responses. Keep it restricted; only train / prompt it with data that is absolutely necessary for the task at hand. If you need to give it access to sensitive data in order to achieve your goal, then you should probably abandon that goal.
And if, after reading this monologue, you still feel the need to use it on your fancy new website: then at least consider having someone red-team it for you before launching.
Fullstack Marketing Leader
1 年How i felt reading this: ??????
Fitness for Performance | Full Stack developer | Self-taught designer
1 年Pentesting people will find this interesting Is this a thing yet, like “prompt injection”