Demystifying PaaS security (part 1)
As a sequel to this article, today I propose to engage in a series of architecture security articles about PaaS in Public Clouds, compute PaaS specifically. The security of compute PaaS is not well documented, so many questions are asked and many concerns are raised by the IT security community. Its key foundations are simple, though, and can be explained without breaching any non-disclosure agreement.
This first installment looks at tenant isolation.
Feeling at home within shared compute services
At first glance, the problem looks simple enough: you want to isolate each tenant's binaries (or code, if interpreted) into separate VMs. To put it another way: you let the host (the hypervisor) do all the fencing around the guests (customer VMs), their executables and their I/O.
Even if it seems the most straightforward way to fall back to a known risk management situation (we've all been doing this on premises for over a decade now), it's worth noting that until recently, not all mainstream public cloud providers followed this path. But so far, and to the best of my knowledge, the alternative approaches have fallen short, and VM isolation is about to become the de facto standard.
So let's take this for granted. Before looking at the engineering issues and caveats it implies, we should first consider the kinds of compute capabilities that are delivered on top of guests. For best portability and execution times, containers are the obvious answer. The fact that containers (or pools of containers, all single tenant) are started in dedicated guests somewhat defeats the prospect of lightning fast execution times, but here Cloud providers leverage the full power of their workhorse backends to pre-provision as many (or as few) instances as necessary to deliver a good customer experience at an affordable price.
A side note: in the Windows version of Docker, Microsoft came up with an elegant solution to wrap all this up into a single concept: running containers in Hyper-V isolation. If Hyper-V is enabled on your PC, or if you rent a VM in the Cloud with nested virtualization support, you can easily test it yourself.
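Here is a minimal sketch of what that test looks like, assuming a Windows host with Docker installed; the image tag is only an example and may need to match your host build:

```
# Hyper-V isolation: the container runs inside its own lightweight utility VM
docker run --rm -it --isolation=hyperv mcr.microsoft.com/windows/nanoserver:1809 cmd

# Default process isolation on Windows Server: the container shares the host kernel
docker run --rm -it --isolation=process mcr.microsoft.com/windows/nanoserver:1809 cmd
```

The first form gives the same kind of fencing public PaaS backends rely on: a kernel boundary per tenant, not just a namespace boundary.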
Now let’s see how well this plays out in practice for various customer compute needs:
- Event-driven and/or single compute tasks: if you read my mind, then yes, I am thinking of the wide and fuzzy range of autonomous tasks, from functions to stateless long-running jobs. Here the guest-plus-container recipe fits pretty nicely because it is easy to implement at scale and easy to schedule and monitor (a quick deployment sketch follows this list). Incidentally, that's the family of compute services that appeared first on the PaaS marketplace, just after guest-only recipes like Beanstalk. We will see in a later installment that other big factors come into play to explain this timeline, however.
- Microservices: with a few exceptions that usually bring maybe 10% of the value of an API, even the most basic microservice is made of several compute and stateful components that need to communicate over some network layer. Here the simple model described above starts to fall apart: to sustain microservices, Cloud providers need some kind of single-tenant container machinery able to manage topology, communication and state awareness.
- Applications: providers need to manage containers in a single-tenant way as above, but with many more components layered into tiers. Add to that the fact that not all application parts scale at the same pace or under the same service level contracts.
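To illustrate how thin that first category is from the customer's point of view, here is a hedged sketch of publishing a single function; the function name, runtime and role ARN are placeholders, and the provider decides where (and in which pre-provisioned guest) the code actually runs:

```
# Package the handler and hand it over; no guest or container is ever visible to the customer
zip fn.zip handler.py
aws lambda create-function \
  --function-name my-fn \
  --runtime python3.6 \
  --handler handler.main \
  --zip-file fileb://fn.zip \
  --role arn:aws:iam::123456789012:role/my-lambda-role
```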
Just as containers were not designed to run in multi-tenant guests, off-the-shelf container managers like Swarm or Kubernetes were not designed to run in multi-tenant public platforms.
In Swarm or Kubernetes, there's an insuperable amount of entanglement between execution nodes and master nodes, and/or a dangerous proximity between controller containers and tenant containers. Customers wishing to run full-fledged APIs or applications in public Cloud PaaS have long had to resort to clusters run in dedicated guests, a less than ideal situation for anyone who comes with a cloud-native experience in mind. That's one of the key reasons why AKS, ACS, ECS and EKS appeared on the PaaS marketplace before more cloud-native compute offers.
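For the record, here is a hedged sketch of that dedicated-guest workaround; resource names, node counts and VM sizes are placeholders:

```
# AKS: the worker nodes are VMs dedicated to this customer
az aks create --resource-group my-rg --name my-cluster --node-count 3 --node-vm-size Standard_DS2_v2

# EKS equivalent with eksctl: the worker nodes are EC2 instances owned by the customer
eksctl create cluster --name my-cluster --nodes 3
```

Isolation is achieved, but the customer is back to sizing, patching and paying for node pools, which is hardly the serverless experience they came for.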
Solution design for secure serverless computing
Let's see how Cloud providers have overcome this for serverless.
- Amazon’s way: what is fascinating about AWS engineers is their almost supernatural gift for making the right design choices from the start. I suspect Werner Vogels has a lot to do with it, although I cannot prove it... I was among the previewers of ECS when it came out not so long ago. At the time, I must admit that I didn’t like the way Amazon scheduled my containers: I found the ECS admission control, based on AWS-provided jobs and state machines, both cumbersome and not cloud friendly. But this is the direct consequence of the fact that in ECS, execution nodes and controller nodes are clearly separated; of course there is a tight relationship between a set of execution nodes and the set of its controller nodes, which is maybe why it has taken time to establish a secure delineation between the two in Fargate (see the sketch after this list). What I considered an awkward design pattern turned out to pave the way for genuinely efficient serverless container management in AWS!
- Azure’s ways: as often in Microsoft’s post-Ballmer universe, there is not one but several ways to explore and reach a given goal. The ways are not competing against each other; rather, they stimulate each other like twin R&D projects. The management of compute services is no exception: if you are an Azure customer, you may use App Services or Service Fabric clusters (I leave grid computing and batch aside). From a bird’s-eye view, both offer more or less the same set of compute capabilities: functions, containers, binaries, stateful or stateless. The difference gets even thinner as time goes by. But for serverless computing, as of October 2018, only one seems to stand out: Service Fabric Mesh. This service takes the best of two things: the strong and ever-growing management capabilities of Service Fabric on the one hand, and VM isolation on the other. My guess is that the operating system behind Service Fabric Mesh is Windows and that containers are isolated in Hyper-V mode, but the actual implementation could be very different. Since Service Fabric supports Windows and Linux alike (another post-Ballmer consequence), it might make more sense for Azure to preserve operational excellence by running a mix of both flavors as part of the global Service Fabric Mesh offer. Time will tell.
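As announced in the Amazon bullet, here is a hedged sketch of what the Fargate experience looks like from the customer's side; the cluster, task definition, subnet and security group identifiers are placeholders, and the controller and execution nodes never appear in the customer's account:

```
# Launch a container task on provider-managed capacity: no EC2 instance to create or secure
aws ecs run-task \
  --cluster my-cluster \
  --task-definition my-task:1 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-0abc1234],securityGroups=[sg-0abc1234],assignPublicIp=DISABLED}"
```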
By building their own container management solutions, Cloud providers have been able to decouple multi-tenant activities (all concentrated into provider-managed controllers) from mono-tenant ones. Customer workloads are isolated into guests, thereby reducing the co-residency risk to a VM escape risk. The residual risk is not voided but standardized; it's up to each customer to either accept it or find additional risk-reduction measures.
In the next installment, we will look at other engineering challenges providers have had to meet in order to deliver secure compute PaaS, some of which remain unsolved.