Take Your Destiny Into Your Own Hands
As various wise people (and dumb people quoting wise people in an attempt to seem wise themselves) have said:
Control your destiny, or someone else will.
The vast majority of a software product isn't written by the creators of that product. The product is actually built upon a foundation of open-source operating systems, libraries, services and applications, with a bit of "special sauce" to turn that heap of building blocks into the ready-to-sell product. You, the builders of those products, depend upon repositories of these bits and pieces to get your own product out there. What could possibly go wrong with constantly integrating software you didn't write and haven't tested for yourself into your own product?
A quick story. Once upon a time there was a network security product, sold as a PC-based hardware appliance, that stored its platform configuration in a Postgres database... all the good stuff like performance tuning, product configuration, well-encrypted usernames and passwords, and the network configuration. Upon booting, it would unpack those config items from the database and write out the config files and other artifacts the product needed. It was Ubuntu-based, and it happily pulled current releases of all of its dependencies, keeping them patched nice and current. All was fine until the day Postgres updated from 8.0 to 8.1. That security appliance dutifully pulled the new version of the database server software and...
Upon the next reboot the appliance went dead to the world. For an obscure schema compatibility reason (today's vocabulary word is "deprecated"), the platform configuration database could no longer be read at boot time. That included the network configuration, meaning the appliance dropped off the network like a brick. No IP address for you! No non-root console logins for you, either! It's dead, Jim.
We'll tell you how that story ended at the end of this article. Guess you'll just have to read it all (see how sneaky I'm being?). But in the meantime...
What can you do in the software equivalent of Murder on the Orient Express?
First Suspect: Broken Patches
The first problematic behavior is blindly pulling patches. Sure, it's generally a great habit to run that dnf -y update every day in your root crontab, but it has its own risks. The chief of these, of course, is that your software will break. A change in a dependency, a bug introduced into a library, or a deprecated API call that's finally been removed can sneak up on you and blow up your own software. Though this is rare, when it happens you're in RevertLand. That's manageable on your desktop, a problem on your servers, but a disaster if your product includes firmware or is some sort of network appliance built on Linux or an RTOS (real-time operating system). If your customers have to revert, they are not going to be happy, particularly because they'll be giving up other security patches, bug fixes and new features just to accommodate that one problem-child patch, not to mention their downtime and potential data loss if they depend upon your product.
Don't have your product blindly download whatever happens to be in a major repository, especially on a constant basis.
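If you do automate updates, at the very least hold back the packages your product can't survive losing until you've vetted the new versions. A quick sketch with dnf (the package names here are just illustrative, and the versionlock plugin has to be installed separately):

    # Skip a known problem package during an automated update
    dnf -y update --exclude='postgresql*'

    # Or pin it in place with the versionlock plugin until you've tested the new release
    dnf -y install python3-dnf-plugin-versionlock
    dnf versionlock add postgresql-server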
Second Suspect: The Dependency-Provider Turns Evil
The Red Hat/CentOS debacle, in which Red Hat effectively hamstrung CentOS to push people toward paid Red Hat licenses, is still causing major headaches: CentOS release support is now measured in months, not years, before a given release is abandoned. Elasticsearch also pulled a Darth Vader-esque "we have changed the license, pray we don't change it again" move that didn't earn them any love and left a lot of people who depended upon them out in the cold. Practices like this leave a huge number of servers that had been expected to receive patches for years suddenly unsupported.
Not cool.
Third Suspect: Rotten Repositories
It's become depressingly frequent: news that a major operating system, container or source code repository has been compromised. The software that backs your product, the libraries and services you rely upon, is found to have had back doors and exploits deliberately introduced by someone who seized control of the repository. You've been hijacked.
Worst of all, once in a while the source of the compromise is actually a maintainer of the code or repository itself. Someone new might take over maintenance of a widely-used package, or the original author might decide that what you really need is better "remote administration software" (would that even count as a "hack"?).
Fourth Suspect: Where's Waldo?
And sometimes the authors of an open source package get mad at... something... and take their ball and go home. Or some patent troll gets a tingle in their evil little brain that says "hold the world hostage, I need a new Ferrari". The repository just plain disappears, and now you're hunting down copies of the packages you need and downloading them from a site that may or may not care about keeping out the bad guys.
Temper tantrums and legalistic tapeworms have a long reach over the Internet.
What to Do, What to Do
As the title says, take your destiny into your own hands.
Your product shouldn't be pulling software patches, etc. from servers you don't control. Run your own update server. The bits on that server need to be what you have tested, what you have trusted. Every bit of software that runs on your product needs to go through your own hands.
First, if your product includes the underlying operating system, fork the repository for that O/S into a repository you run yourself, and configure your product to pull patches and updates from that repository only. This includes the O/S, the libraries, the services and applications your product depends upon (where possible). Your own repository should be the sole source of truth for 100% of the software used by your product (or as close to 100% as you can manage).
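For an RPM-based product, that can be as simple as shipping a single repo definition that points only at your own mirror. The names and URLs below are placeholders, not a prescription:

    # /etc/yum.repos.d/product-updates.repo -- the only repository enabled on the appliance
    [product-updates]
    name=Product Updates (internal, tested mirror)
    baseurl=https://updates.example.com/el9/$basearch/
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://updates.example.com/keys/RPM-GPG-KEY-product

Populate that mirror yourself rather than pointing it straight at upstream; more on that in a moment.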
Also ensure that your repository affirmatively identifies itself. Have SHA256 checksums of your downloads available on a different site; that way the bad guys have to seize control of two separate servers to get away with it. Announce the availability of patches in one authoritative place, such as your website. In your product documentation, tell people how to verify that this "source of truth" really is you. SSL certificates and pre-shared SSH keys/fingerprints are a start. No plain HTTP or FTP for you, and if scp -o "StrictHostKeyChecking=no" is anywhere in your patch process, report to the brig for the disciplinary process.
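A minimal sketch of that two-server idea, with hypothetical hostnames and file names:

    # Pull the patch bundle from the update server...
    curl -fsSLO https://updates.example.com/bundles/product-2.4.1.tar.gz

    # ...pull the checksum from a *different* server, and verify before doing anything else
    curl -fsSL https://checksums.example.net/product-2.4.1.tar.gz.sha256 | sha256sum -c -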
And while you're at it, since you're almost certainly using a DNS name to identify that repository, armor the heck out of your DNS records. DNS hijacking is a great way to redirect people from a legitimate server to one loaded with malware, credential stealers and offensive content.
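Signing the zone (DNSSEC) and publishing a CAA record so only your chosen certificate authority can issue for that name are good starting points, and both are easy to spot-check from any client. The hostname here is, again, a placeholder:

    # Does the answer validate? delv reports "; fully validated" when DNSSEC checks out
    delv updates.example.com A

    # Which certificate authorities are allowed to issue for this name?
    dig +short CAA updates.example.com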
Second, test, validate and approve any critical dependencies before pushing them to your repository. The aforementioned Postgres version change would not have been a problem if updates were being pulled from a company-controlled repository and tested before being pushed up to it. This means that periodic validation of patches is part of your software lifecycle, and that you need to monitor the origin of those patches for news of critical changes, such as vulnerability remediation, so you can get an approved, tested new version onto your repository quickly. That's just the cost of doing business: you'll need to identify someone in DevOps and someone in QA to own this process as part of their duties, and you'll have to budget the time those people need to do the work in their overall schedule. This isn't a small one-hour-a-week thing, and it's not an "additional" duty you can just dump into someone's workload without relieving them of some other time-gobbling burden.
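One way to make "tested before published" concrete is to mirror upstream into a staging tree, run your validation there, and only then promote packages into the repository your products actually see. A rough sketch for an RPM-based flow, with made-up repo IDs and paths:

    # Pull upstream packages into a staging area that products never see
    reposync --repoid=baseos --download-path=/srv/repo/staging

    # ... run your integration and upgrade tests against the staging tree ...

    # Promote only what passed, then rebuild the metadata your appliances consume
    rsync -a /srv/repo/staging/ /srv/repo/production/
    createrepo_c --update /srv/repo/production/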
Third, scan your known dependencies for vulnerabilities that will prompt you to test and patch. This is actually one of the easiest things you can do, as there are many services (such as GitHub's Dependabot) that can automatically notify you when something needs a swift kick in the source.
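If your source lives on GitHub, for example, Dependabot is driven by a small YAML file checked into the repository; a minimal configuration (the ecosystem and schedule are just illustrative) looks like this:

    # .github/dependabot.yml
    version: 2
    updates:
      - package-ecosystem: "pip"   # which package manager to watch
        directory: "/"             # where the manifest lives
        schedule:
          interval: "daily"        # how often to check for new versions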
Fourth, have a strategy for reverting in case a patch goes wrong. If something breaks, no data should be lost from the period leading up to the instant of the break. Too often products are forward-only. Your patch process needs to be able to recover from a broken upgrade.
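How you revert depends on your stack, but the principle is snapshot first, patch second. A hedged sketch using LVM snapshots and dnf's transaction history (the volume names are hypothetical):

    # Snapshot the root volume before touching anything
    lvcreate --snapshot --size 5G --name pre_patch /dev/vg_appliance/root

    # Apply the update
    dnf -y update

    # If the new bits misbehave, roll back the package transaction...
    dnf history undo last

    # ...or merge the snapshot back over the origin (takes effect when the volume is reactivated)
    lvconvert --merge /dev/vg_appliance/pre_patch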
And last, communicate to your customers that this is how you go about the business of patching. Include it in your documentation, have your pre-sales engineers mention it, and be ready to help your customers (or your own DevOps people if you offer hosted services) set up outgoing firewall rules such that your product can only patch from your servers.
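Those outgoing rules can be as blunt as "HTTPS to the update server, nothing else." A minimal iptables sketch, using a documentation-range address as a stand-in for your repository:

    # Allow update traffic only to the company repository, drop any other outbound HTTPS
    iptables -A OUTPUT -p tcp --dport 443 -d 203.0.113.10 -j ACCEPT
    iptables -A OUTPUT -p tcp --dport 443 -j DROP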
Whodunnit?
The overworked junior IT engineer, in the server room, with the RJ45 crimping tool. It was brutal.
And now the promised conclusion of the story.
The product in question had a deeply buried QA tool that could lock and revert specific patches of specific services. That QA tool was reachable from the local console through the root login (each root password was unique, tied to the serial number of the appliance, and stored well-encrypted with a two-part key that required big mungo supervisor approval to decrypt) and could also be triggered through the update process for the product software itself. Customers that hadn't rebooted since patching had that QA tool tickled as part of the next update check, locking Postgres to 8.0 until we got our act together. Customers whose appliances had not been seen patching were contacted and given instructions on what to do. Only one customer was "bricked", and there was no data loss or compromise. They were also understanding (and made a few changes to their own internal patching processes in response).
We got really, really lucky.
Do YOU feel lucky? Well, do ya?
Have some lucky hashtags. #patching #repository #updates #didyoupatchfivetimesorsix #maytheforcebewithyou