Looking for privacy in AWS virtual private clouds
Is it just me, or does the AWS announcement of VPC routing to an EC2 instance's ENI this week at re:Invent feel lackluster? I'm not talking about the value of the ingress capability (which is also arguable, as a pattern), but about the egress one.
In several articles (latest here), I have reported the critical lack of a decent, native way to prevent data exfiltration from AWS compute and middleware PaaS.
Cloud providers now excel at ingress protection, endpoint policies and storage encryption, but privacy is more than that: it takes an egress transparent proxy (like Azure Firewall) to dot the i's and cross the t's.
When we identified this gap back in early 2015, immediately after we kicked off our Cloud transformation, we at Société Générale CIB were pretty much stranded with a single option to cover the insider threat in Azure and AWS: we had to build and manage our own transparent proxy service.
So we built a prototype; let me call her "Shirley".
As the sole member of her species, Shirley became extinct before her first anniversary, but she proved instrumental during the critical phases of risk assessment and Cloud security doctrine articulation. We learnt quite a bit along the way! Let me share a few takeaways. Looking back, some if not most of them might seem obvious now:
- Proxy performance can scale linearly with throughput even for the most demanding data transfers as long as you choose the right technology
- Layer 7 (nothing less) inspection is required for virtual host endpoints
- An IaaS transparent proxy will fail to deliver at unpredictable intervals since it depends on PaaS DNS updates (both universes are loosely coupled, so service discovery is clumsy at best)
- An IaaS proxy cannot interface with most PaaS offerings
- Asset whitelisting in a Cloud proxy is much simpler than on premises, as long as you accept that Cloud users do not need to browse the Internet except from their workstations
- Service endpoint policies and/or private endpoints are far more efficient than proxies for PaaS-to-PaaS data leakage protection (note: such policies didn't exist in 2015...), but proxies are necessary for PaaS-to-Internet protection
- An IaaS proxy will always cost you more than PaaS (this can be said of many services, of course)
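To illustrate the Layer 7 and whitelisting points above, here is a minimal Python sketch of a hostname-based egress check. It is not Shirley's code: the bucket names, `ALLOWED_HOSTS`, `host_from_request` and `is_allowed` are all hypothetical, and it only handles plain HTTP/1.1 (for HTTPS a proxy would have to look at the TLS SNI or terminate TLS, which is left out here). The idea it shows is that two virtual hosts of the same storage service can share the same IP addresses, so only the requested hostname tells the corporate bucket from someone else's.

```python
from typing import Optional

# Hypothetical allow-list: the only hostnames workloads may reach.
ALLOWED_HOSTS = {
    "corp-data.s3.amazonaws.com",      # assumed corporate bucket
    "corp-logs.s3.amazonaws.com",      # assumed corporate bucket
}


def host_from_request(raw: bytes) -> Optional[str]:
    """Return the lower-cased Host header of a raw HTTP/1.1 request, without port."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    for line in head.split("\r\n")[1:]:          # skip the request line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            host = value.strip().lower()
            # Drop an optional ":port" suffix (IPv6 literals are not handled).
            if ":" in host and not host.startswith("["):
                host = host.rsplit(":", 1)[0]
            return host.rstrip(".")
    return None


def is_allowed(host: Optional[str]) -> bool:
    """Allow only exact matches; a missing Host header is denied."""
    return host is not None and host in ALLOWED_HOSTS


if __name__ == "__main__":
    leak = b"PUT /dump.csv HTTP/1.1\r\nHost: attacker-bucket.s3.amazonaws.com\r\n\r\n"
    legit = b"PUT /report.csv HTTP/1.1\r\nHost: corp-data.s3.amazonaws.com\r\n\r\n"
    for request in (leak, legit):
        host = host_from_request(request)
        print(host, "ALLOW" if is_allowed(host) else "DENY")
```

An IP or port filter would treat both requests identically, which is why the check has to happen on the hostname. Keeping the list to exact names also reflects the whitelisting takeaway: when workloads have no reason to browse the Internet, the list stays short.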
After some digging through a pile of unsorted documents, I managed to recover a simple early sketch of Shirley that I thought was lost. It's not very useful anymore except for the historical record, I guess, but here it is.
Above: Shirley circa spring 2015. HA proxies were grouped in two Auto Scaling groups (Inner and Outer) and configured in "inverted mode", i.e. they were used as forward proxies instead of reverse proxies. Scaling went smoothly thanks to 1) highly optimized processing and 2) useful EC2 metrics and horizontal scaling capabilities.
This story is an opportunity for me to stress the importance of experimenting. I believe technical architects should spend a sizable amount of their time doing just that! Not as part of technology watch, but as an integral part of their field activities. Experimenting means knowing what you do and where you lead the pack. It also involves copious amounts of serendipity, a prime fuel for IT architects...
Aside from the serviceable Shirley, we spared no effort in workshops with our Cloud providers to have a native PaaS transparent proxy offer added to their catalogs. We were very excited when Azure was the first to deliver 1.5 years ago (with the release of Azure Firewall), because our teams took an active part in both the private and public previews of Azure Firewall; we will surely not miss a chance to preview a similar service from AWS if such a thing is put on their agenda. If you are a privacy champion, neither should you.
The quest for privacy in AWS is definitely not over.