Beyond Luck: Navigating the Maze of Software Assurance


The discovery of CVE-2024-3094, a.k.a. the xz or liblzma backdoor, has re-ignited a crucial conversation on software assurance. There are plenty of articles already which describe this stealthy and sophisticated attempt, most likely by a nation-state actor, to insert a highly privileged remote code execution capability into computers around the world.

The focus of this article is on some of the ramifications of this latest supply chain attack.


First, however, please bear with me while I provide a small bit of background for those who are not already intimately familiar with this attack:



In a nutshell, a hostile actor surreptitiously ingratiated themselves within the open-source community in 2022-2023, using a series of fake personas established as early as 2021, and in so doing essentially commandeered (a year ago) responsibility for maintaining a particular library of code, which is itself incorporated into some of the most highly privileged and sensitive pieces of code running on Linux operating systems today. Over the last year, additions were made to the code which culminated in this backdoor making it into the early-release versions of some major Linux distributions. Left unchecked, this backdoor could have exposed a very large population of computers globally to instructions to perform tasks at the attacker's whim, delivered using a protocol (ssh) which is commonly enabled and exposed to networks like the internet. Last week, through a lucky series of coincidences, this attack was discovered by a vigilant database software developer doing entirely unrelated performance engineering work.
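(As an aside for readers who want to check their own machines: the releases named in the CVE-2024-3094 advisories were xz/liblzma 5.6.0 and 5.6.1. The following is a minimal, hypothetical sketch of such a check in Python; actual remediation guidance came from the distributions' security advisories, not from a script like this.)

```python
# Minimal, hypothetical sketch: flag the two xz releases named in the
# CVE-2024-3094 advisories (5.6.0 and 5.6.1). Illustrative only; real
# remediation guidance came from distribution security advisories.
import re
import subprocess

BACKDOORED = {"5.6.0", "5.6.1"}

def installed_xz_version():
    """Return the version reported by `xz --version`, or None if absent."""
    try:
        out = subprocess.run(["xz", "--version"], capture_output=True,
                             text=True, check=True).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    match = re.search(r"xz \(XZ Utils\) (\d+\.\d+\.\d+)", out)
    return match.group(1) if match else None

version = installed_xz_version()
if version in BACKDOORED:
    print(f"WARNING: xz {version} is one of the backdoored releases")
else:
    print(f"xz version: {version or 'not found'}")
```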


It was a close-run thing; success would have been a triumph for the hostile intelligence service that ran the operation. Had they gone undetected for a little while longer, countless computers worldwide would have been within their grasp. To be clear, if there is an aspect of this operation which demands understanding and respect, it is not the code, and it is not the technology: it is the development of the human relationships and trust which enabled the integration of this code into the very heart of our computer operating systems today.



The Hidden Dangers in our Digital Foundations

Most people — or at least most people who are not software engineers — may not think of writing code as being a matter of luck. A Turing machine, so the received wisdom goes, simply does what it is told, in the order in which it is told, and hence a computer program either works or it does not. If you have instrumented your code sufficiently and implemented adequate testing then you may even hope to know (before you use your program for real work) whether its behavior will be as you intended.

As a mental model of IT systems this is all well and good. Granted, it does not quite take account of the complexity and fragility of enterprise systems, which implement and connect computer code in ways that are brittle and that, from time to time, introduce behaviors no single person may have anticipated.

In reality, though, the situation is far worse than that.

Almost entirely gone are the days when a new IT system involved the bringing together of a sizable team of programmers and engineers to write and construct the system from whole cloth. Nor, even, do we typically rely upon a small team of programmers who themselves obtain trusted and well-scrutinized libraries from reputable vendors with a deep bench of their own programmers, testers and reviewers. What we have all too often today is an ecosystem wherein the partially blind, including a good number of remarkably talented programmers, lead the blind.

Information technology vendors, even large and ostensibly reputable software companies, publish and sell products which are themselves compositions of code from a great variety of sources. Corporations shove this code together, without fully understanding why or what they are doing, and are delighted if and when everything works more or less as they fancied it might. Unnecessary or even outright inappropriate components are frequently included, and whenever something does not go according to plan, additional code is often inserted: code which one insufficiently-skilled programmer copied from another unskilled programmer's post on the internet, or which was coughed up by an AI engine without its operator understanding precisely what is passing through their fingers.


At this point, you may be in one of a few camps: for example, you may be saying,

"Goodness, this seems like such hyperbole.? It couldn’t possibly be this bad!"

I regret to say that by and large, especially in the case of larger enterprises, it is. I have been teaching programmers software security since the nineties, and in that period I have witnessed the comprehension of — and general preparedness to comprehend — security fundamentals getting worse, not better.

(Meanwhile, great strides forward have been made in some complementary areas, including some which have allowed us to declare victory over a large swath of low-hanging vulnerabilities.)


Or perhaps you already appreciate the truth of the foregoing commentary, but are predicting what I will say next, and are thinking to yourself,

“Oh dear, is this going to turn into another rant against open-source software?”

Not at all: that this attack used an open-source library is just a manifestation of reality, which is that open-source libraries form a big part of our codebases today!

While it is true that this recent backdoor was injected via an open-source process, using a so-called “supply chain injection” — that is, covertly inserting malicious code into one piece of software that is used in (i.e. is part of the “supply chain” for) another piece of software or IT system — this type of attack is by no means peculiar to open-source code, as we have seen with other recent events such as the 2019/2020 SolarWinds attack, which compromised a for-profit software manufacturer's released product as a way to infiltrate malicious behaviors into SolarWinds' customers' environments.


Incremental Improvements

Let us accept reality: let us take it for granted that an attacker may find a way to insert their own code into a library or other software component which you use in your systems.

What, then, should we learn from this recent attack? I dare say there are many lessons, ranging from the detailed and technically specific to the more general.


Some lessons will focus on the open-source process, through asking questions such as:

  • Should we allow commits of binary blobs, even ones masquerading as test files? (A simple detection sketch follows below.)
  • Should we know the identity of the committer, and to what extent should this be assured/verified?
  • Should more be done to review any and all commits?
  • Are there meaningful red-flags we might create which would not be readily evaded?
  • How should individual projects be governed?
  • What constitutes a critical project?

These are some of the questions on which I expect to see a lot of digital ink spilled over the coming weeks.
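On the first of those questions: the xz payload was smuggled in via binary “test” archives, so even a crude gate that flags opaque blobs for human review has some value. Here is a minimal, hypothetical sketch in Python; the NUL-byte heuristic and the skipped directories are illustrative assumptions, not any project's actual commit policy.

```python
# Minimal, hypothetical sketch: flag files in a source tree that look like
# opaque binary blobs. The NUL-byte heuristic and the skipped directories
# are illustrative assumptions, not any project's real commit policy.
from pathlib import Path

SKIP_DIRS = {".git"}  # assumption: only version-control metadata is exempt

def looks_binary(path: Path, probe: int = 8192) -> bool:
    """Treat any file whose first bytes contain a NUL byte as binary."""
    with path.open("rb") as f:
        return b"\x00" in f.read(probe)

def flag_blobs(repo_root: str) -> list:
    """Return files under repo_root that a commit gate might question."""
    flagged = []
    for path in Path(repo_root).rglob("*"):
        if path.is_file() and not (SKIP_DIRS & set(path.parts)):
            if looks_binary(path):
                flagged.append(path)
    return flagged

if __name__ == "__main__":
    for blob in flag_blobs("."):
        print(f"review needed: {blob}")  # e.g. a 'test' archive hiding a payload
```

A real gate would allow an explicit, reviewed allow-list of blobs; the point is that opacity itself becomes a question someone must answer.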


More thoughtful people may also ask why we include so many libraries in our most privileged processes and their attendant code, with questions such as:

  • Why do systemd and OpenSSH (sshd) need this library to be linked, and what does that mean for our defenses? (See the dependency-check sketch below.)
  • What other important libraries may require close counter-intelligence to be performed so that we can reassure ourselves of their cleanliness?
  • How can we design systems more securely, so that we can place greater trust in fewer things?
  • How many other similar backdoors remain and have gone unnoticed thus far?
  • How else might we hunt for these?
  • What are the ramifications of such weaknesses for the state of our nation’s (and enterprises’) cybersecurity?

Again, all good, solid, and somewhat technically-focused questions, which speak to the problematic state of software assurance today.
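The first of these questions can at least be asked of your own machines mechanically. Below is a minimal, hypothetical sketch that runs the standard `ldd` utility (which resolves transitive dependencies, the route by which liblzma reached sshd via libsystemd on affected distributions) against a binary and flags libraries on a watch-list; the binary path and the watch-list are assumptions for illustration.

```python
# Minimal, hypothetical sketch: list the shared libraries a binary loads
# (ldd resolves transitive dependencies) and flag any on a watch-list.
# The binary path and the watch-list are illustrative assumptions.
import subprocess

WATCH_LIST = {"liblzma"}        # libraries we want to account for
BINARY = "/usr/sbin/sshd"       # assumption: a typical sshd location

def linked_libraries(binary: str) -> list:
    """Return the library names `ldd` reports for the binary."""
    out = subprocess.run(["ldd", binary], capture_output=True,
                         text=True, check=True).stdout
    return [line.split()[0] for line in out.splitlines() if line.strip()]

for lib in linked_libraries(BINARY):
    if any(watched in lib for watched in WATCH_LIST):
        print(f"{BINARY} loads {lib}: do we understand why?")
```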


The Human Element of Cybersecurity, and The Luck of the Draw


Then, there are humans.


As we rely ever more on various forms of machine learning and artificial intelligence, and as we continue down the long-traveled path of devolving and outsourcing our enterprise information technology to the lowest bidders, human expertise and intellectual curiosity have regrettably dissipated among many in developer and programmer roles and, in some quarters, these qualities have become rarer than hen's teeth. You may wish to rail against this fact: it is not an especially palatable one. To be sure, the truth is not always beautiful.


In years gone by, your best software engineers understood their code and the code they relied upon; they understood the hardware on which the code ran; and they likely understood (as they should) your business and how their code was able to benefit your top line. And still they would question and challenge almost everything, seeking to understand things better, and asking thoughtful, open-ended questions, such as:

  • Why…?
  • How…?
  • What happens when…?
  • Why was this approach taken?
  • What if…?
  • Would it matter if…?
  • What would prevent…?


Whether because of too many concurrent tasks and foci at the individual and even team levels, or because of the prevalence of more specialized roles and less generalist approaches to education and training, or maybe even because of the diminishment of research in enterprises, it is today far more common to see personnel whose extent of questioning is limited to questions like:

  • Did it run?
  • Does it appear to be working?
  • Does it have any red lights turned on which we cannot ignore?


And then there are the few, like Andres Freund, the database developer who noticed something that the threat actor who wrote the liblzma exploit appears not to have thought or worried about. The threat actor — most likely a team within a country's intelligence/hacking service — wrote their attack in a well-obfuscated but inefficient way, which introduced both a material delay (a few tenths of a second) and a material CPU impact upon each new ssh connection. In many environments, this would almost certainly have gone unremarked and likely even unnoticed. But in a system which is exposed to the internet, where other computers are often trying (and failing) dozens of times every minute to connect — to do things like “brute-force” their way into your machine by guessing a password — these individual impacts all added up. Nobody complained. Quite feasibly, nobody would have noticed for quite some time. But Andres did. Not because he was paying attention to security, or to ssh connections, but because he was doing some performance engineering, some “benchmarking”, on his database engine. He looked, and saw an anomaly elsewhere in the operating system. He recognized that it was unusual for sshd to be occupying the amount of CPU that it was. Then he used his lifetime of skills to look deeper, and even though he is not a “security” person, he knew enough to dig deeper; knew enough to read the source code of associated libraries and question what was going on; and ultimately, knew enough to “call 911” and enlist others' expertise in reviewing what he had found.
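To make the measurement side of that story concrete: the tell was, at bottom, a shift in connection latency and CPU use that a simple baseline would surface. Below is a minimal, hypothetical sketch that times how long an ssh daemon takes to present its protocol banner; the host, port, and sample count are illustrative assumptions, and this is a crude proxy for the profiling Andres actually did, not his methodology.

```python
# Minimal, hypothetical sketch: baseline how long an ssh daemon takes to
# present its protocol banner. Host, port, and sample count are
# illustrative assumptions; this is a crude proxy for real profiling.
import socket
import statistics
import time

HOST, PORT, SAMPLES = "127.0.0.1", 22, 20  # assumptions for illustration

def banner_time(host: str, port: int, timeout: float = 5.0) -> float:
    """Seconds from TCP connect until the server sends its SSH banner."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.recv(256)  # e.g. b"SSH-2.0-OpenSSH_9.6 ..."
    return time.perf_counter() - start

samples = [banner_time(HOST, PORT) for _ in range(SAMPLES)]
print(f"median {statistics.median(samples) * 1000:.1f} ms, "
      f"max {max(samples) * 1000:.1f} ms")
# A sudden shift of a few hundred milliseconds against yesterday's
# baseline is exactly the kind of anomaly worth a closer look.
```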


So what? On the one hand, am I saying that we got lucky? Yes, I think that is a fair statement. On the other hand, am I saying that we can rely on more people like Andres as a means of defense? Sometimes, yes, but as a general rule, certainly not. It is easy to conceive of a similar attack going unnoticed, especially if it were made even more sophisticated or more closely targeted.


So, we got lucky; beyond that, what are the takeaways here?


Fostering a Culture of Curiosity & Vigilance


People like Andres are largely unsung heroes: proverbial canaries in the coal mine whose understanding of, and expertise in, the IT systems which underpin your enterprise, coupled with their keen curiosity, can lead to results such as we have seen here.


Andres may not detect the next nation-state attack against your industry. Still, he, and many thousands of expert engineers like him, will continue to pay dividends for the continued functionality and security of your business systems. He knew enough to recognize something as being out of place; he knew how, and had the tenacity, to investigate further; he understood when to seek assistance and had the confidence to seek it; he did not assume it was someone else's job — although his role is not “security”, he knew that security is everyone's job; he felt comfortable that he would have the support of his leadership to spend time on this and to share his findings; and he may have just saved your company from an 8-K breach filing.


Generalist Expert: an adaptable and versatile individual with deep knowledge and skills across a wide range of areas, capable of solving complex problems by integrating diverse perspectives and disciplines

Here, then, are some less technical questions which you might want to ask:

  • How many generalist experts do you have in your IT team, and how do they feel about their jobs?
  • Do your IT staff know that security is part of their job?
  • Does your tone from the top encourage and promote intellectual curiosity, the reporting and questioning of system anomalies (even those unrelated to one's primary discipline), and the investigation of root causes?
  • How are you encouraging and promoting generalist and cross-functional experts in your company?
  • Are you supportive of information sharing and vulnerability disclosure programs?
  • How well do you and your team understand what your systems do and how?
  • Would your team know if the fine-grained metrics and behaviors of your systems changed, and would they know when such a change should warrant investigation?
  • Are your policies and corporate culture attracting more or fewer generalist experts? How should you change this?


The good news is that we still have lots of folks like Andres looking at many of the common building blocks of IT. Of course, as your business processes and use cases get increasingly specific and narrow, far fewer of these generalist experts remain who have current knowledge of, awareness of, and responsibility for those systems. This piece of malicious code was designed to check some settings to avoid being caught when run in certain sandbox and debugging environments, but at the end of the day it still set out to run on essentially all computers. Had it been targeted in a more granular fashion, its discovery would not have been up to the world's entire set of Andreses; it might well have been down only to the ones in your enterprise.

You may not catch the next supply chain injection before it is used against you, even with a decent complement of curious and vigilant experts. If not, you will want them around all the more when the consequential risks are realized: after you are breached, such a diversity of skills and thought processes is all the more important in achieving a rapid recovery.


Make your own luck. Make sure you build your team with enough Andreses.

