Cloud confidential computing in 2021
Microsoft, Amazon and Google are all beefing up their Confidential Computing offerings. Are they equivalent in terms of risk coverage? Are they safe to use?
Before delving into the specifics of Confidential Computing, let's figure out what benefit it attempts to bring as opposed to Vanilla Computing.
Although data protection has many other aspects and side effects, our concern here is preventing read access to data, whatever its state (in transit, in memory, at rest) and regardless of the health of its environment (compromised keys, critical vulnerabilities, unmonitored endpoints, pass-through firewalls...).
If you are familiar with handling highly sensitive data such as personally identifiable information, you are presumably already guarding yourself on premises against most of these threats.
In the Public Cloud, however, Vanilla Computing gives you little visibility into what a rogue Cloud provider administrator or an adverse co-resident (operating from a malevolent or compromised third-party account) may be doing to your data.
The first serious attempt to cover these risks was announced in late 2017 by Mark Russinovich, Azure CTO: an early access service then called Azure Confidential Computing, which set the pace (and the name) for the whole Cloud industry. In a nutshell, what it did and still does is leverage Intel enclaves to process customer data in an isolated and encrypted area of an Azure CPU. This area is managed by a new set of CPU instructions called SGX. A crucial factor to be aware of is that the Cloud provider (here: Microsoft) has no access to the encryption keys.
Since then, Amazon and Google have joined the ranks by leveraging Nitro and AMD enclaves respectively.
Impact of Confidential Computing on the Cloud shared responsibility model
The model that we all know defines two parties: you and your provider. Both have dev and ops responsibilities.
With Confidential Computing comes a new party: the founder. By nature, it has no ops capabilities, so "someone" has to fill in for it. Foundry-related ops mainly cover the setup and continuous monitoring of the trustworthiness of the enclaves, plus a few other seemingly menial but actually important tasks like firmware maintenance.
Zeroing in on this uncharted area of the responsibility model, we quickly realize that only the customer can legitimately take on foundry-related ops.
Notes:
- founder is just a role, it might actually be fabless;
- front-end connectivity to the enclave is offered by the provider as part of control plane ops (this is not depicted in the diagram above as it does not break the traditional model);
- as the title suggests, oversimplification entails inaccuracies: you might argue that your version of the responsibility model is slightly different than mine, and you would be right!
The real game changers are depicted in black and red: "Founder" is a new actor in the model (black), and "To Be Defined!" (red) is a new responsibility that must be taken on by the customer by default, since nobody else can fill in for the founder.
What are the limitations of Confidential Computing?
As usual... there are still residual risks to your data
On paper, enclave technology seems to protect against read access by disgruntled Cloud administrators or adverse co-residents, but there are a few points to bear in mind:
- as discussed in the section about the shared responsibility model, the key generation and signing process that takes place during enclave configuration must be kept entirely away from the prying eyes of rogue provider admins and co-residents. This is a challenge today, since this stage is performed in the unprotected space of the service (e.g. the VM) itself;
- a Cloud provider employee may engage in collusive hacking with some Intel/AMD/Nitro expert to get the enclave keys: the scenario is far-fetched for Azure and Google, but not so unlikely for Amazon since Nitro enclaves are built by Amazon;
- the underlying software and hardware technology is complex, mission-critical and quite new. To handle highly sensitive data, you have to have very good reasons. And if you have very good reasons, you have to have very high expectations. If you have very high expectations, you need a strong track record backed by a large user base, otherwise you're going to have a hard time proving that co-residents and Cloud provider administrators are properly isolated from your confidential space;
- some tools may be used by a privileged Cloud employee to attempt tampering with the hardware (physical breaches should trigger some fuse, and the critical parts of the microprocessor should self-destruct or be reliably auto-wiped. Should. Hopefully...).
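The first point above can be illustrated with a minimal sketch, assuming the customer prepares key material on a machine they control before anything reaches the Cloud. The function names (`generate_signing_key`, `sign_enclave_image`, `verify_enclave_image`) and the manifest layout are hypothetical, and HMAC-SHA256 is used only as a standard-library stand-in for the asymmetric signature a real enclave toolchain would rely on.

```python
import hashlib
import hmac
import secrets


def generate_signing_key() -> bytes:
    """Create a 256-bit key on a customer-controlled machine (assumption: not the Cloud VM)."""
    return secrets.token_bytes(32)


def sign_enclave_image(signing_key: bytes, image: bytes) -> dict:
    """Return the only artifacts meant to leave the premises: a digest and a signature."""
    digest = hashlib.sha256(image).hexdigest()
    # HMAC stands in for an asymmetric signature (illustration only).
    signature = hmac.new(signing_key, digest.encode(), hashlib.sha256).hexdigest()
    return {"digest": digest, "signature": signature}


def verify_enclave_image(signing_key: bytes, image: bytes, manifest: dict) -> bool:
    """Recompute digest and signature, then compare both in constant time."""
    expected = sign_enclave_image(signing_key, image)
    digest_ok = hmac.compare_digest(expected["digest"], manifest["digest"])
    signature_ok = hmac.compare_digest(expected["signature"], manifest["signature"])
    return digest_ok and signature_ok


if __name__ == "__main__":
    key = generate_signing_key()              # stays on premises
    image = b"hypothetical enclave binary"    # placeholder content
    manifest = sign_enclave_image(key, image)
    print(verify_enclave_image(key, image, manifest))                # True
    print(verify_enclave_image(key, image + b"tampered", manifest))  # False
```

With a real asymmetric scheme, only the public key would be needed for verification, so the private key would never have to exist in the provider's environment at all.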
Limitations from native-ness (or lack thereof)
The range of use cases you can run in Cloud enclaves is currently quite limited. With the recent announcement of Confidential Computing for Azure Kubernetes, this range appears to have widened and is no longer an IaaS monopoly. However, enclave secrets are managed by Microsoft, which contradicts the principles set out in our shared responsibility model.
Beyond pure computing, if you want to make full use of other PaaS backends (e.g. middleware), your stateless runtimes need to get data from your (usually) stateful services, so as of today you need to make encrypted queries to those PaaS backends.
In practice, most of the time you will need to load the backend into enclave memory to process a given query. You cannot just pass on the encrypted query straight to the stateful engine. As long as homomorphic encryption has not become widely accepted, loading the whole data structure is probably the only option currently at hand. It brings some big constraints on your side:
- if you rely on a third-party solution (a database, a memory cache, ...) to maintain state, you must ensure that (a) the solution can be loaded into memory, and (b) it supports the limited subset of hardware instructions the enclave has to offer;
- if you use a Cloud native solution, the assumption that the Cloud provider has no read access to your data becomes quite weak. As usual in the Public Cloud, things are evolving at frantic speed and in the right direction! Two months ago, Azure announced the general availability of Confidential Computing for Azure SQL. This leverages enclaves to perform in-place encryption (i.e. SQL computations within an enclave). Still... enclave secrets are managed by Microsoft;
- if you use an in-house custom solution, you can say goodbye to your hopes of being native and your return on investment will certainly take a hit.
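To make the "load everything into enclave memory" constraint more concrete, here is a minimal sketch, assuming a backend that only ever stores opaque encrypted records. All names (`seal`, `unseal`, `query_inside_enclave`) are hypothetical, and the SHA-256 based keystream is a toy stand-in for a real authenticated cipher such as AES-GCM; it is not secure and only keeps the example within the standard library.

```python
import hashlib
import json
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy SHA-256 counter keystream. NOT secure: stand-in for a real cipher."""
    blocks = [
        hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        for counter in range((length + 31) // 32)
    ]
    return b"".join(blocks)[:length]


def seal(key: bytes, record: dict) -> bytes:
    """Encrypt one record before it is handed to the (untrusted) backend."""
    plaintext = json.dumps(record).encode()
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def unseal(key: bytes, blob: bytes) -> dict:
    """Decrypt one record; meant to run only inside the enclave."""
    nonce, ciphertext = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(ciphertext))
    return json.loads(bytes(c ^ s for c, s in zip(ciphertext, stream)))


def query_inside_enclave(key: bytes, backend_blobs: list, predicate) -> list:
    """The backend cannot filter ciphertext, so every blob is loaded and decrypted."""
    records = [unseal(key, blob) for blob in backend_blobs]
    return [record for record in records if predicate(record)]


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    backend = [
        seal(key, {"name": "alice", "balance": 120}),
        seal(key, {"name": "bob", "balance": 40}),
    ]
    print(query_inside_enclave(key, backend, lambda r: r["balance"] > 100))
```

The cost is visible in `query_inside_enclave`: memory use and decryption work grow with the size of the whole dataset, not with the size of the result.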
Tentative Conclusion
Azure had a head start in IaaS and is still making the major breakthroughs required for making Confidential Computing a reality in PaaS. AWS and Google are not far behind and will catch up at some point.
But I think there is an important obstacle to overcome in both IaaS and PaaS: how to let customers manage the enclaves of their compute capabilities securely from end to end?
Customers need independent access to the enclave for configuring secrets, and that access must not be susceptible to man-in-the-middle attacks.
Just yesterday (on March 15), Mark Russinovich announced a new landmark in Azure IaaS leveraging the latest generation of AMD EPYC processors. It sounds like a broadening of their hardware offering more than an actual improvement on the caveats discussed in this article.
A second obstacle relates to Amazon's posture on Confidential Computing: they departed from Google and Microsoft by manufacturing their own semiconductors. While this is good for performance optimization and cost reduction, one should be cautious about the consequences of this decision for regulated industries and, more generally, for organizations delivering cutting-edge business value.
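One way to picture this requirement is a customer-side check that runs before any secret is released to an enclave. The sketch below is only an illustration under strong assumptions: the quote format, `make_quote` and `release_secret_if_trusted` are hypothetical, and an HMAC key stands in for the hardware vendor's attestation key, which in a real deployment would be verified through a certificate chain.

```python
import hashlib
import hmac
import secrets


def make_quote(attestation_key: bytes, measurement: str, nonce: bytes) -> dict:
    """Simulate the report an enclave would return (hypothetical format)."""
    mac = hmac.new(attestation_key, measurement.encode() + nonce, hashlib.sha256)
    return {"measurement": measurement, "mac": mac.hexdigest()}


def release_secret_if_trusted(attestation_key: bytes, expected_measurement: str,
                              nonce: bytes, quote: dict, secret: bytes):
    """Return the secret only if the quote is authentic, fresh and as expected."""
    expected_mac = hmac.new(
        attestation_key, quote["measurement"].encode() + nonce, hashlib.sha256
    ).hexdigest()
    # Binding the MAC to our own nonce rejects quotes replayed by a middleman.
    if not hmac.compare_digest(expected_mac, quote["mac"]):
        return None
    # The measured code must be exactly the build the customer signed off on.
    if not hmac.compare_digest(quote["measurement"], expected_measurement):
        return None
    return secret


if __name__ == "__main__":
    attestation_key = secrets.token_bytes(32)   # stand-in for the vendor key
    trusted_build = hashlib.sha256(b"hypothetical enclave v1").hexdigest()
    nonce = secrets.token_bytes(16)             # fresh challenge from the customer
    quote = make_quote(attestation_key, trusted_build, nonce)
    print(release_secret_if_trusted(attestation_key, trusted_build, nonce, quote, b"db-password"))
```

In practice the secret would also be encrypted to a key bound to the quote, so that it can only be opened inside the attested enclave; that step is left out of this sketch.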
Note: this article is an overhaul of the initial assessment I wrote a couple of years ago on the matter.
Senior Cloud Architect / PO
2 years ago: Very interesting
Christophe, may I suggest also considering the Hyper Protect services available in IBM Cloud and used for IBM Cloud for Financial Services. They rely on IBM Secure Execution tech (s390x proc). No security breach has been published yet. No SDK is required to use the enclave. IBM also has a technology on its POWER processor (IBM Protected Execution Facility), but no service has been made available yet.
Head of Platform at Secretarium & Klave
3 years ago: Very interesting article! And definitely something to share with the experts in that domain, Secretarium.
Senior Director Engineering at Kyriba
3 years ago: Thank you for this article, brilliant as always, Christophe! I have a few comments and questions though, as you can imagine! :)
I'd like to understand your point about those AMD EPYC processors being a technical detail; that is far from being the case. Intel enclaves have to be set up at the application level, whereas AMD EPYC offers full VM encryption at the CPU virtualisation level. The former requires the codebase to be changed and adapted to leverage the enclaves; the latter is seamless from an app perspective.
Your second point, which I don't quite get, is about PaaS encrypted queries to stateful services. Let's imagine a managed DB service running on EPYC processors: a client pops a database through the control plane, and it is provisioned on a VM encrypted by the AMD processor and dedicated to the client. What is the issue?
To sum up my perspective and what GCP has today:
- all data at rest is encrypted, and KeyStore management can be delegated to another company while remaining compatible with managed services;
- data in transit is not an issue;
- Confidential Computing brings another layer of confidentiality to IaaS and to more and more PaaS services.
I do agree that homomorphic encryption is another interesting path, but still, I think you're a bit hard on the value AMD brings in :) Thanks for the time and effort you put into those articles!
Follow me for Key Management, HSM, PKI, PQC & all things crypto.
3 years ago: Take a look at Fortanix #ConfidentialComputingManager for securely orchestrating the deployment of applications within secure enclaves (including the one-click conversion of existing applications to enable lift-and-shift): https://azuremarketplace.microsoft.com/en-us/marketplace/apps/fortanix.enclave_manager?tab=overview