Software quality problems and AI in the desert
David Stocks
Leading Germane Advisory, a specialist cyber security, privacy, and data/AI governance firm.
With some distance from the 45-degree heat of Las Vegas, I’ve had some time to think about the two main themes that kept recurring across the week. I also went well past the character limit for a post, so here we are in article form.
A software quality problem?
Bob Lord (former CISO at the Democratic Party, Yahoo, and Twitter, and current Senior Technical Advisor at the Cybersecurity and Infrastructure Security Agency) has long argued that the root cause of most incidents is poor software quality.
This was a consistent theme among US Government stakeholders in particular - CISA Director Jen Easterly used this language repeatedly across Black Hat and DEF CON, suggesting a better term for benign-sounding “vulnerabilities” would be “product defects”. She also argued for changes to software liability that would incentivise software vendors to adopt secure development processes and provide an avenue for recourse if they didn’t. Speaking about the issue more generally, Heather Adkins concluded that “our ability to win will depend on not letting the bad guys see as many vulnerabilities in the future.”
Addressing software quality problems was also the focus of the AIxCC Village at DEF CON. AIxCC is the Defense Advanced Research Projects Agency (DARPA) AI Cyber Challenge, a USD $29.5 million competition in which teams design AI systems that identify and remediate software flaws to make software safer. The semi-final was held at DEF CON this year, and the seven finalists selected will compete at next year’s DEF CON. After the competition concludes, finalists are required to open-source the systems they develop. I think this is genuinely exciting and, in the long term, could change the running gun battle that is vulnerability management.
AI Fatigue / AI Hope
Yes, every software vendor in the whole city was boasting about AI-powered gains that would give their product wings, sentience, or magical powers. People who have heard many of these pitches are naturally exhausted. It is tempting to mentally file AI in the same basket as other terms beloved by marketers, like “military-grade”, “zero-trust”, “100% MITRE ATT&CK coverage”, or “single pane of glass”.
I think part of what makes the “we are using AI to …” claims more nuanced is that they’re often true, particularly for the types of products that have used ML algorithms for many years - they’re just not the hot new thing.
Countering the AI fatigue were repeated examples of LLMs being used for good, in use cases that simply didn’t exist this time last year. As LLMs came to the fore over the last couple of years, a lot of security teams and vendors looked at ways to use them to advance security outcomes, piloted them, and then came to speak about the work or demo the fruits of their labour.
Inside the AIxCC village, Google Cloud was showing off Google Security Operations (formerly known as Chronicle) and how they’d integrated Gemini into the product to let analysts write natural language queries that are then translated into the product’s query language, among other enhancements.
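To make the pattern concrete, here is a minimal sketch of the natural-language-to-query idea using the google-generativeai Python SDK. This is my own illustration, not Google’s implementation: the model name, target query syntax, and prompt are assumptions, and any generated query should be reviewed before it’s run.

```python
# Illustrative sketch only: translate an analyst's natural-language question into a
# SIEM-style search query with an LLM. Not Google's implementation; the model name,
# target query syntax, and prompt wording are assumptions for demonstration.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential
model = genai.GenerativeModel("gemini-1.5-pro")

PROMPT_TEMPLATE = """You are a security analyst assistant.
Translate the question below into a single UDM-style search query.
Return only the query, with no explanation.

Question: {question}
"""

def natural_language_to_query(question: str) -> str:
    """Ask the model for a query; a human should review it before executing it."""
    response = model.generate_content(PROMPT_TEMPLATE.format(question=question))
    return response.text.strip()

if __name__ == "__main__":
    print(natural_language_to_query(
        "Show failed logins to admin accounts from outside Australia in the last 24 hours"
    ))
```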
Anthropic, makers of Claude 3.5 Sonnet, were showing off an integration they’d built into GitLab to identify vulnerabilities in code and submit merge requests aimed at fixing them (another talk I attended during the week suggested these fixes sometimes affected functionality, so there is perhaps some work still to do).
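Anthropic hasn’t published the internals of that demo as far as I know, but the general shape is easy to sketch with the anthropic and python-gitlab packages. The project, file path, prompts, and branch names below are hypothetical, and any suggested change still needs human review and testing before it’s merged.

```python
# Rough sketch of the "LLM reviews a file and raises a GitLab merge request" pattern.
# This is not Anthropic's integration; the project, file path, prompts, and branch
# names are hypothetical, and suggested fixes still need human review and testing.
import anthropic
import gitlab

llm = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
gl = gitlab.Gitlab("https://gitlab.example.com", private_token="YOUR_TOKEN")
project = gl.projects.get("security-demos/webapp")  # hypothetical project

f = project.files.get(file_path="app/db.py", ref="main")
source = f.decode().decode("utf-8")

message = llm.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=4096,
    system="You are a secure code reviewer. If the file contains a vulnerability, "
           "return the corrected file contents only. Otherwise return it unchanged.",
    messages=[{"role": "user", "content": source}],
)
fixed_source = message.content[0].text

if fixed_source.strip() != source.strip():
    branch = "llm-suggested-fix"
    project.branches.create({"branch": branch, "ref": "main"})
    f.content = fixed_source
    f.save(branch=branch, commit_message="Apply LLM-suggested security fix to app/db.py")
    project.mergerequests.create({
        "source_branch": branch,
        "target_branch": "main",
        "title": "LLM-suggested security fix for app/db.py",
        "description": "Automated suggestion; review carefully, as fixes can affect functionality.",
    })
```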
Instacart’s security team talked about their journey to reduce the amount of long-lived access to their critical systems, and how they’d leveraged ChatGPT to automate the “grey area” of just-in-time access reviews. By using an LLM to review roles, privileges, and role and policy statements, they were able to automate some decisions and provide context to decision makers for other requests. They’ve published some of their work on GitHub.
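Their GitHub repo has the real details; as a rough illustration of the pattern (not Instacart’s published code), the sketch below asks an LLM to suggest approve or escalate for a request given the role’s policy statements, keeping a human in the loop for anything it escalates. The request fields, decision schema, and prompt are all assumptions.

```python
# Illustrative sketch of LLM-assisted just-in-time access review. This is not
# Instacart's published code; the request fields, decision schema, and prompt are
# assumptions for demonstration, and ambiguous requests go back to a human.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You review just-in-time access requests. Given a request and the role's policy "
    'statements, respond with JSON: {"decision": "approve" or "escalate", '
    '"rationale": "..."}. Escalate anything ambiguous or high-risk.'
)

def review_access_request(request: dict, role_policy: dict) -> dict:
    """Return an approve/escalate suggestion plus rationale for the human approver."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": json.dumps({"request": request, "policy": role_policy})},
        ],
    )
    return json.loads(response.choices[0].message.content)

if __name__ == "__main__":
    print(review_access_request(
        {"user": "jdoe", "role": "billing-read-only", "duration_hours": 4,
         "justification": "Investigating a customer refund discrepancy"},
        {"role": "billing-read-only", "statements": [{"Effect": "Allow",
         "Action": ["billing:Get*", "billing:List*"], "Resource": "*"}]},
    ))
```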
Other talks touched on reviewing corpora of security data (past tickets, alerts, etc.) to add context to the issues and events that pop up today.
There are absolutely serious risks to address as we adopt LLMs that work alongside us, perhaps best highlighted by Michael Bargury of Zenity, whose presentations demonstrated live weaknesses in M365 Copilot where a single adversarial inbound email could lead to serious confidentiality or integrity impacts. Their excellent work is described further on their blog.
Despite some cause for caution, I left the desert broadly optimistic about the gains and real utility that have been extracted in a relatively short period of time. There are plenty of offensive use cases, but I think the next few years are going to give defenders new tools to improve our efficiency and speed. There are things we can do to handle the variability that sometimes comes with these techniques, and it’s also worth being humble about where our existing processes are today - the Instacart team rightly pointed out that they were trying to improve on the 99% acceptance rate that manager approval of access requests currently produces.
US Election side-note
Lastly, I am too much of a politics and international relations nerd not to mention the US election, where the campaign was reaching a high tempo as we visited the swing state of Nevada.
CISA and various state and local election officials speaking at DEF CON’s Voting Village expressed confidence in the security and integrity of the upcoming election.
A presidential election campaign was hacked, but the impact and press coverage were significantly more muted than for the 2016 DNC hack. It’s hard to tell whether we’re learning not to give adversarial countries what they’re seeking, or whether there’s not much the affected campaign would say in internal emails that it wouldn’t say publicly.
Finally, the amount of AI-generated imagery, audio, and video circulating around election topics and protagonists is substantial. It’s hard to tell whether people are genuinely being persuaded by this material or recognise it as artificial. Optimistically, this may be the one big election where the tools to produce such content exist but the watermarking and detection capabilities aren’t yet widely rolled out. It’ll certainly be an interesting area of study for academics next year.