Azure RBAC refactored
The Azure role model, deeply rooted in Azure Active Directory, is a marvel of scalability and consistency: it handles the lifecycle and distribution of hundreds of thousands of objects without protesting.
It is arguably a complex system because, as a core Cloud component, Azure Active Directory (AAD) is heavily performance driven. It has to meet countless internal SLAs as well as an occasional share of conflicting objectives. A possible explanation for these conflicting objectives is that the interplay between AAD and Azure Resource Manager (ARM) is much more subtle than we might imagine: under the common brand "Azure" lie two separate operating models, and often two separate philosophies, which have to work in concert to offer a unified experience to Azure customers and Azure product teams.
Inevitably with this kind of dual setup, some real-life IAM customer objectives cannot be fully met in terms of performance and scope.
Because of the organizational gap between the AAD and Azure universes, Cloud customers are deprived of an RBAC abstraction layer functionally comparable to Azure Resource Graph (ARG).
The rough equivalent of ARG in AAD is the Microsoft Graph API. As an Active-Directory + Office 365 centric tool, it offers some very valuable capabilities to poll AAD's own RBAC, but not to poll Azure RBAC itself.
Why?
The Microsoft Graph API is embedded into the Azure management API. We have to resort to it whenever we interact with Azure IAM objects. The Graph API is strongly aligned with the complex internal Azure RBAC model:
Such a rigid mapping makes it difficult to query the IAM information that is meaningful for some important use cases we have to implement.
Let me give you four very practical examples.
Example #1: The Site Reliability Engineer use case
If you are a feature team member responsible for the security of the team's subscription, it's not only important to know what people are entitled to, but also who these people are.
Your user story is: "As an SRE, I need to know all principals that have roles on my subscriptions (or my management groups or my resource groups), and what these roles are".
Example #2: The forensics analyst use case
Ouch! A subscription has been compromised! Now all is back to normal, but I still don't understand how the damn hacker managed to get in... Maybe some rights weren't revoked on time, or the rights attached to some people are too generous?
User story: "As a SOC team member, I need to have a holistic view of all roles and permissions attached to a given principal at a given point in time on a given scope".
Example #3: The IAM analyst use case
In large corporations, resorting to custom roles to match the complexity of the internal operating model is unavoidable. This can quickly make the management and understanding of fine-grained permissions intractable.
User story: "As an IAM analyst, I need to reason about built-in and custom roles to detect toxic combinations in and across the Control and Data planes".
Special use case (example #4): The pen tester / blue teamer
And finally we have an opportunity to tamper with any RBAC resource without actually harming or weakening the security of AAD or Azure! Isn't that great?
User story: "As a pen tester or a blue teamer, I want to inject and simulate faults in Azure IAM reference data. I want to try out infiltration scenarios and probe detection controls by modifying Azure RBAC in an arbitrary way".
AAD role model flattening
In all these situations, getting the information in a fast and straightforward way is far from obvious.
But the good news is that in the first three use cases, requests are read only. This opens an avenue for moving away from the tight coupling standing between the internal representation of RBAC objects and the exposed API.
How can we do that?
One possible way is to "flatten" the RBAC model into a (small) number of key-value stores (depicted in yellow in the image below). Each store corresponds to a simple view on denormalized data fetched through a simple key-based API. The representation is insulated from the internal state of AAD.
This is not as simple as it seems, since one needs to:
a) patiently and exhaustively poll every single resource container [*] with the role assignment API to generate the various key/value mappings, taking scope inheritance into account
b) recursively expand all AAD groups encountered during this process into a list of individual keys (one for each group member).
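Steps a) and b) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the actual polling code: the assignments and group memberships are hard-coded sample data standing in for what the role assignment API and the group membership API would return, the scope paths are simplified so that a child scope is prefixed by its parent's path (real Azure scope identifiers do not nest this way across management groups and subscriptions), and the names `expand` and `flatten` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data; none of these identifiers are real.
ASSIGNMENTS = [
    {"scope": "/mg/root", "principal": "group-admins", "role": "Owner"},
    {"scope": "/mg/root/sub-1", "principal": "alice", "role": "Reader"},
    {"scope": "/mg/root/sub-1/rg-web", "principal": "bob", "role": "Contributor"},
]
# Group -> direct members. The cycle between the two groups is deliberate.
GROUPS = {
    "group-admins": ["carol", "group-ops"],
    "group-ops": ["dave", "group-admins"],
}
# Resource containers to poll (assumption: simplified hierarchical paths).
CONTAINERS = ["/mg/root", "/mg/root/sub-1", "/mg/root/sub-1/rg-web"]


def expand(principal, groups, seen=None):
    """Step b): recursively expand a principal into individual members."""
    seen = set() if seen is None else seen
    if principal not in groups:
        return {principal}          # a user or service principal
    if principal in seen:
        return set()                # group already visited: break the cycle
    seen.add(principal)
    members = set()
    for member in groups[principal]:
        members |= expand(member, groups, seen)
    return members


def flatten(assignments, groups, containers):
    """Step a): build a scope -> {(principal, role)} mapping with inheritance."""
    by_scope = defaultdict(set)
    for container in containers:
        for a in assignments:
            # An assignment applies to its own scope and to every descendant.
            inherited = container.startswith(a["scope"] + "/")
            if container == a["scope"] or inherited:
                for member in expand(a["principal"], groups):
                    by_scope[container].add((member, a["role"]))
    return by_scope


store = flatten(ASSIGNMENTS, GROUPS, CONTAINERS)
```

In this sketch, `store["/mg/root/sub-1"]` contains the Owner role for carol and dave (inherited from the management group through the expanded group) plus alice's Reader role, but not bob's assignment, which sits one level lower.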
Here are potential (and simplified) store schemes that could help implement our first three examples:
Meeting the needs of the fourth use case (the pen tester / blue teamer) is not shown here: it is a simple matter of giving full admin rights to a copy of the various stores.
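As a rough illustration of that idea, the sketch below takes a deep copy of a store and injects a rogue assignment into the copy only. The store layout, the sample identifiers and the helper names `sandbox_copy` and `inject_assignment` are assumptions made for this example.

```python
import copy

# Hypothetical reference store: scope -> list of {principal, role} entries.
REFERENCE_STORE = {
    "/mg/root/sub-1": [{"principal": "alice", "role": "Reader"}],
}


def sandbox_copy(store):
    """Return an independent copy that can be tampered with freely."""
    return copy.deepcopy(store)


def inject_assignment(store, scope, principal, role):
    """Simulate a fault by adding an arbitrary role assignment."""
    store.setdefault(scope, []).append({"principal": principal, "role": role})
    return store


sandbox = sandbox_copy(REFERENCE_STORE)
inject_assignment(sandbox, "/mg/root/sub-1", "mallory", "Owner")
```

The reference store is left untouched, so detection controls can be exercised against the sandbox without any effect on the real data.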
Stores can be implemented as a memory cache such as Azure Redis Cache and mediated through a lightweight API in the style of the Elasticsearch API, or simply saved into JSON files. As far as I'm concerned, I use a mix of simple raw files and JSON files to store denormalized AAD data.
As usual, when one makes a local copy of a golden data source, there are going to be consistency issues: refreshing the KV stores is a long-running process, so the data may have to be handled in an eventually consistent way. Since our stores are read only, this shouldn't be much of an issue in most use cases.
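The JSON file option could look something like the sketch below. The file layout, the sample data and the helper names (`save_store`, `load_store`, `lookup`, `invert`) are assumptions made for illustration; they are not the format I actually use.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical store: scope -> list of {principal, role} entries.
BY_SCOPE = {
    "/mg/root/sub-1": [
        {"principal": "alice", "role": "Reader"},
        {"principal": "carol", "role": "Owner"},
    ],
    "/mg/root/sub-1/rg-web": [
        {"principal": "alice", "role": "Contributor"},
    ],
}


def save_store(path, store):
    """Persist one key-value store as a JSON file."""
    text = json.dumps(store, indent=2, sort_keys=True)
    Path(path).write_text(text, encoding="utf-8")


def load_store(path):
    """Load a key-value store back from its JSON file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def lookup(store, key):
    """Simple key-based access; unknown keys yield an empty list."""
    return store.get(key, [])


def invert(by_scope):
    """Derive a principal -> [{scope, role}] view from the scope-keyed store."""
    by_principal = {}
    for scope, entries in by_scope.items():
        for entry in entries:
            by_principal.setdefault(entry["principal"], []).append(
                {"scope": scope, "role": entry["role"]}
            )
    return by_principal
```

The scope-keyed view answers the SRE question (who has which role on my subscription), while the inverted, principal-keyed view answers the forensics question (what can this principal do, and where).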
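One simple way to make this staleness visible to consumers is to stamp each snapshot with its refresh time and let readers decide how old is too old. The sketch below assumes a hypothetical snapshot envelope (`refreshed_at` plus `data`) and hypothetical helper names; the acceptable age is a per-use-case choice.

```python
from datetime import datetime, timedelta, timezone


def make_snapshot(data, refreshed_at=None):
    """Wrap store data with the UTC time at which it was refreshed."""
    refreshed_at = refreshed_at or datetime.now(timezone.utc)
    return {"refreshed_at": refreshed_at.isoformat(), "data": data}


def is_stale(snapshot, max_age, now=None):
    """Return True when the snapshot is older than the accepted max_age."""
    now = now or datetime.now(timezone.utc)
    refreshed = datetime.fromisoformat(snapshot["refreshed_at"])
    return now - refreshed > max_age


# Usage with fixed timestamps so the outcome does not depend on the clock.
refreshed = datetime(2021, 6, 1, 8, 0, tzinfo=timezone.utc)
snapshot = make_snapshot({"/mg/root/sub-1": []}, refreshed_at=refreshed)
checked_at = datetime(2021, 6, 1, 20, 0, tzinfo=timezone.utc)
stale_for_soc = is_stale(snapshot, timedelta(hours=1), now=checked_at)
stale_for_audit = is_stale(snapshot, timedelta(days=1), now=checked_at)
```

Here the same twelve-hour-old snapshot is too old for a consumer that accepts one hour of lag, and perfectly fine for one that accepts a day.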
Our best option remains of course that Microsoft addresses the issue head on and delivers the Azure IAM API of our dreams...
Conclusion
The purpose of this article is to:
Notes
[*] In ARG parlance, a resource container is a placeholder for assets, i.e. a management group, a subscription or a resource group. Hopefully you won't need to go as deep as the last one.