AI package hallucinations: a note from our CTO
Author: Roy Horev, Co-Founder & CTO at Vulcan Cyber
Voyager18, our talented research team, recently uncovered a new attack technique called AI package hallucination.
This combines two dangerous vectors that almost every engineering manager and every security practitioner is aware of:
1. Relying on open-source libraries
Everyone recognizes the convenience of using third-party software to save time and concentrate on business logic. It's an obvious choice given the improved functionality, maintainability, free support, and the myriad of other benefits it offers. The downside? Developers seldom take the extra time to thoroughly inspect the downstream code they pull in, so hidden suspicious elements are easily overlooked.
2. The rapid adoption of generative AI tools
These tools are widely used during coding and debugging - a no-brainer given the time and hassle they save. However, as our research demonstrated, reliance on these tools and on third-party libraries can open the door to threat actors.
So we have two axioms we want to move forward with:
A. We can't afford, and do not want, to give up the benefits of using AI and OSS.
B. We need some kind of aid to mitigate the potential risk of package hallucinations.
We wanted to identify steps we could take that wouldn't hinder our development speed (or morale), but would alert us to potential threats.
Establishing legitimacy
We'll keep all the "softer" measures out of this article (training, code reviews, etc.) so we can focus on some technical solutions that help us detect possible issues. When introducing new sanity checks, we usually prefer to enforce them in CI, ensuring possible issues are caught before they hit production.
We decided we wanted an automated bot to go over code changes during the merge request process and try to determine whether an import looks legitimate or not. For the purposes of this article, we won't delve into the technologies and code we actually use. Instead, let's turn to the tool of the hour – ChatGPT – for assistance.
First, let's generate some code that can fetch the pull request and extract the added package names from it.
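Here is a minimal sketch of what such a script could look like, assuming a GitHub-hosted repository and a Python project whose dependencies live in requirements.txt. The repository name, token, and pull request number below are placeholders, and the GitHub "list pull request files" endpoint is used to read the diff:

```python
import re
import requests

# Placeholder values - replace with your own repository, token, and PR number.
GITHUB_API = "https://api.github.com"
REPO = "your-org/your-repo"
PR_NUMBER = 123
TOKEN = "ghp_your_token_here"

HEADERS = {
    "Authorization": f"token {TOKEN}",
    "Accept": "application/vnd.github+json",
}


def added_packages_from_pr(repo: str, pr_number: int) -> set:
    """Return package names added to requirements.txt in the given pull request."""
    url = f"{GITHUB_API}/repos/{repo}/pulls/{pr_number}/files"
    files = requests.get(url, headers=HEADERS, timeout=10).json()

    added = set()
    for changed_file in files:
        if not changed_file["filename"].endswith("requirements.txt"):
            continue
        # The 'patch' field holds the unified diff for this file.
        for line in changed_file.get("patch", "").splitlines():
            if line.startswith("+") and not line.startswith("+++"):
                # Strip version specifiers, e.g. "requests>=2.31" -> "requests".
                match = re.match(r"\+\s*([A-Za-z0-9._\-]+)", line)
                if match:
                    added.add(match.group(1).lower())
    return added


if __name__ == "__main__":
    print(added_packages_from_pr(REPO, PR_NUMBER))
```

Any package name this surfaces that isn't already a known dependency becomes a candidate for the legitimacy check described next.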
To complete the process, we need to create a score that defines the legitimacy of a package, on a scale of 1 to 100. This will allow us to configure a threshold that suits our security appetite and alert based on it.
We decided to build the score from three parameters:
1. Quality
2. Community Engagement
3. Activity
We effectively built a weighted score model, so we could get a final result of how legitimate the package is – and make a decision on whether to block the pull request or let it slide.
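As an illustration only (the signals, weights, and threshold here are hypothetical, not the ones we actually use), a weighted score along those three dimensions could be sketched like this, pulling metadata from the PyPI JSON API and the GitHub repository API:

```python
import requests

# Illustrative weights and threshold - tune these to your own security appetite.
WEIGHTS = {"quality": 0.4, "community": 0.3, "activity": 0.3}
BLOCK_THRESHOLD = 60  # scores below this block the pull request


def package_metadata(name: str):
    """Fetch package metadata from the PyPI JSON API; None if the package doesn't exist."""
    resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    return resp.json() if resp.status_code == 200 else None


def legitimacy_score(name: str) -> int:
    """Rough 1-100 legitimacy score built from quality, community, and activity signals."""
    meta = package_metadata(name)
    if meta is None:
        return 1  # a package that doesn't exist on PyPI is the classic hallucination case

    info = meta["info"]
    releases = meta.get("releases", {})

    # Quality: does the project look documented and maintained?
    quality = 0
    quality += 40 if info.get("description") else 0
    quality += 30 if info.get("home_page") or info.get("project_urls") else 0
    quality += 30 if len(releases) >= 3 else 0

    # Community engagement: GitHub stars, if a GitHub repo is linked (crude proxy).
    community = 0
    urls = info.get("project_urls") or {}
    repo_url = next((u for u in urls.values() if u and "github.com" in u), "")
    if repo_url:
        owner_repo = "/".join(repo_url.rstrip("/").split("/")[-2:])
        gh = requests.get(f"https://api.github.com/repos/{owner_repo}", timeout=10)
        if gh.status_code == 200:
            stars = gh.json().get("stargazers_count", 0)
            community = min(100, stars)  # one point per star, capped at 100

    # Activity: number of releases as a stand-in for how actively it's maintained.
    activity = min(100, len(releases) * 10)

    score = (WEIGHTS["quality"] * quality
             + WEIGHTS["community"] * community
             + WEIGHTS["activity"] * activity)
    return max(1, min(100, round(score)))


def should_block(name: str) -> bool:
    """Decide whether a newly added package should block the pull request."""
    return legitimacy_score(name) < BLOCK_THRESHOLD
```

Wiring a check like should_block into the merge request bot gives us the final gate: packages that score below the threshold stop the pull request, and everything else slides through.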
Next steps
We haven’t seen the last of security risks emerging from generative AI.
And while this doesn’t solve every problem, it certainly is a start.
If you’d like to hear more about AI package hallucination, join Bar Lanyado as he takes us through his research at our webinar on June 21st.
Sign up for free: