How we release Windows updates
Where do you see the work we do? When Windows updates get released to the world! For many of you, that’s a single device, or perhaps your work and home PCs. For us, that’s millions or even billions of devices, including virtual machines. To serve them, we manage hundreds of different releases simultaneously every month. You can think of each of those releases as an individual factory, preparing and releasing a set of updates for devices around the world. The illustration below shows the number of new requests we used to receive each month. Those numbers have since doubled.
[Illustration: number of new release requests received each month]
If that doesn’t look like a lot at first glance, let’s dive into the complexity of updates we publish, and the scale at which they’re delivered. For illustrative purposes, we’ll walk through three of the many release readiness tools we’ve built and our robust evidence-based approach to understanding when releases are ready to be published, throttled, or accelerated.
The release orchestrator
With a charter to service all Windows products across all product lines – desktop, IoT, HoloLens, server, cloud, etc. – Windows Servicing and Delivery (WSD) needs automation that can keep all of these complex trains running together. Apart from Windows updates, we also deliver .NET and Microsoft Edge updates to these product lines. That’s over 1,400 unique release types! We’ve built our release orchestrator to manage update releases in a touchless fashion.
It brings together several dedicated teams to manage the comprehensive rules, processes, and tools that make a release happen. There’s also a sequence of predetermined steps for data structures and their component parts:
So, view the orchestrator as a sort of control plane for all these individual services and processes. Shashank Gupta, software engineering manager, explains the scale of it:
“Curating and delivering updates to billions requires a deliberate effort to invest in systems and services to achieve highly flexible capabilities, economy of scale, and a pipeline that is secure by default. To fully grasp the magnitude of our scale and reach, consider the complex problem of ensuring every update reaches every machine that it is applicable to – with precision and a margin of error that is 0. We make it happen.”
The sequence, which often represents a suggested automated process flow, can be modified and/or overridden at run time, depending on quality signals and business imperatives. Three factors make it especially challenging:
Regardless of these challenges, we repeat these efforts for every one of your security and non-security updates, every month.
Why is this so important? While a release can be represented as a broad set of activities we perform, our internal orchestration engine accelerates that end-to-end process and ultimately the time necessary for us to get a release into your hands. Shashank Gupta highlights:
“Our release service orchestrator is coordinating approximately half a million activities in a year, which in yesteryear would have been done by human input and decisions. We service over 14 million requests per month through our internal APIs to keep the machinery running at clockwork cadence.”
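The idea of a predetermined sequence that can be modified or overridden at run time can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the step names, the `run_release` helper, and the convention that an override of `None` skips a step are hypothetical and do not describe the actual orchestrator.

```python
def run_release(default_steps, overrides=None):
    """Run steps in order, applying any run-time overrides.

    overrides maps a step name to a replacement callable,
    or to None to skip that step (hypothetical convention).
    """
    overrides = overrides or {}
    log = []
    for name, action in default_steps:
        if name in overrides:
            action = overrides[name]
            if action is None:
                log.append((name, "skipped"))
                continue
        log.append((name, action()))
    return log


# Hypothetical default flow: the suggested automated sequence.
default_steps = [
    ("build", lambda: "built"),
    ("validate", lambda: "validated"),
    ("publish", lambda: "published to all"),
]

# Hypothetical override: a quality signal asks for a throttled publish.
log = run_release(default_steps, {"publish": lambda: "published throttled"})
print(log)
```

Keeping the default flow as plain data and applying overrides at the moment of execution means the suggested sequence stays intact for the next release, while a single run can still deviate from it.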
Let’s track it
Some of our brightest colleagues also built a suite of visualization tools to support every Windows release. Our release tracking tool is the management plane that sits above and alongside the orchestrator, giving a front-end, UI view of the individual services on the factory floor. It’s used by engineers across multiple roles: from teams developing and releasing features to the release managers who ensure Patch Tuesday happens every month without fail. They track the progress of release payloads through the phases of the pipeline and ensure we’ve completed every step needed for that update to be ready to go.
Our release tracking system provides a detailed view of the content that gets published as part of any Windows release, along with its current state. It visualizes the end-to-end workflow for servicing releases and reflects the status of the payload in each phase of the pipeline. That’s one centralized place for all details, controls, and automations that run for an update payload, managed by our colleague Aaron Voros.
Every release includes multiple validation steps, all the changes that are added to the payload, and the different systems and stakeholders involved. Even on a small scale, managing this manually would be cumbersome and prone to human error. At the global scale at which we operate, we agree with our friend Aaron: clear, data-driven decision support is vital. We know it because our tracking tool has come a long way since its inception, and we continue to build in automation and expand its controls.
Why is visualizing this complexity so important? This data-driven dashboard approach integrates validation signals from Insiders, internal self-host, diagnostic data, and other sources so that we know an update is ready for release. We call these quality signals, and we monitor them with our release readiness tracking dashboards. A visual UI helps us quickly pinpoint risks that require attention and brings cohesion across dozens of systems. We’ve built this suite of tools so that our teams can apply what we have learned in the past to the payload, validation, release, and post-release steps of every update.
Senior Product Manager Corrine H. says:
“We take all this information and give a single snapshot for the release that says, ‘yes, this release looks great, everything’s green.’ And if it’s not, then we can block engineering managers and directors to sign off until further investigation or resolution is completed.”
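As a rough illustration of that single snapshot, here is a minimal Python sketch of a readiness gate that rolls several quality signals into one answer. The signal names, the three-state `Status`, and the rule that any non-green signal blocks sign-off are assumptions chosen for the example, not a description of the actual dashboard.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


@dataclass(frozen=True)
class QualitySignal:
    source: str      # hypothetical names, e.g. "insiders", "self_host"
    status: Status


def readiness_snapshot(signals):
    """Return (ready, blockers); ready only if every signal is green."""
    blockers = [s.source for s in signals if s.status is not Status.GREEN]
    # No signals at all means no evidence, so sign-off stays blocked.
    ready = bool(signals) and not blockers
    return ready, blockers


signals = [
    QualitySignal("insiders", Status.GREEN),
    QualitySignal("self_host", Status.GREEN),
    QualitySignal("diagnostic_data", Status.YELLOW),
]
ready, blockers = readiness_snapshot(signals)
print(ready, blockers)
```

Returning the list of blocking sources alongside the yes/no answer mirrors the point of the quote: when a release is not green, the snapshot should show where investigation is needed.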
The role of our publishing system
Once “everything’s green,” we’re ready to release the update! Our Update Publishing Services (UPS) team created a central publishing tool for any content released to Windows. The mission of the tool is to effectively deliver Windows updates to the right devices at the right time. This endeavor goes beyond optimizing the content of the update and is led by Principal Manager Product Management Aarthi Thangamani and her UPS team.
It's just so critical to get this right—to make sure you get the right updates, keep your device(s) responsive, and responsibly manage network and energy costs. If we don’t, we might not be efficient at helping you prevent serious problems: slowing down your computer, flooding your network, or not providing fixes quickly enough to help protect you from security threats.
So how does our publishing tool help ensure that you get the fixes and features you need? And how does it keep track of each specific device, with its specific configuration, out of billions of monthly active devices across the world? Aarthi explains:
“We ensure that there is technology that prevents over-offering, that detects if such an issue happens. We have controls in place to mitigate those situations by throttling or expiring the update, so that users don't receive this update anymore. And the orchestrator in a way helps us through these several situations.”
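The throttle-or-expire mitigation Aarthi describes can be sketched in a few lines of Python. This is an illustrative sketch only: the `Offering` type, the `over_offer_ratio` input, and both thresholds are hypothetical, and the real detection and control logic is not described in this article.

```python
from dataclasses import dataclass


@dataclass
class Offering:
    update_id: str
    offer_rate: float = 1.0   # fraction of applicable devices offered the update
    expired: bool = False


def mitigate(offering, over_offer_ratio, throttle_at=0.01, expire_at=0.05):
    """Pick a mitigation from the share of devices wrongly offered an update.

    Thresholds are made-up values for illustration.
    """
    if over_offer_ratio >= expire_at:
        # Expire: devices stop receiving this update entirely.
        offering.expired = True
        offering.offer_rate = 0.0
        return "expire"
    if over_offer_ratio >= throttle_at:
        # Throttle: keep offering, but to a much smaller share of devices.
        offering.offer_rate = min(offering.offer_rate, 0.1)
        return "throttle"
    return "continue"


offering = Offering("update-a")
first = mitigate(offering, over_offer_ratio=0.02)
second = mitigate(offering, over_offer_ratio=0.20)
print(first, second, offering)
```

Checking the expire threshold first means the most severe mitigation always wins when both conditions hold.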
All of that before you see that blue or orange dot, signaling you have updates waiting for you!
Final musings
We’ve built these and many other innovative pipeline technologies because they are needed for efficiency, security, and scale. For example, we’ve developed “helper” technologies to become the most efficient at unified communication, building, and monitoring the payload, the pipeline, and dependencies. All of that to provide the right offerings to your devices faster than ever. We obsess about doing better, going faster, and helping you get updates as safely and reliably as possible.
Our teams are continuously innovating and pushing the boundaries. These tools help us power new scenarios for the Windows product and beyond – including important products like .NET, Windows Defender, Edge, and Azure. They also power continuous innovation in Windows 11 and the expansion of the new Bing preview experience to the Bing and Microsoft Edge mobile apps, all for your greater security, productivity, growth, and satisfaction.