A primer of automated reasoning for Cloud engineers
https://creativepunking.com/how-saiva-academy-is-providing-sound-placements/

A primer of automated reasoning for Cloud engineers

Here's an invitation for an easy stroll through the intricacies of automated reasoning - the future of native cloud Security!

As a sequel to my first article , we are going to task Microsoft's amazing Z3 theorem prover with finding out whether a bunch of VPCs/VNets are connected through network peerings. Hopefully this example will get you accustomed with the basic concepts required for proving a wider range of AWS/Azure/GCP security patterns.

We will proceed in three steps: VNets acquisition, VNets partitioning and theorem proving.

Step #1: get some VNets

First we need to collect a listing of VNet IDs and their peers. If we get this information from an authoritative source then I believe we are allowed to call these peerings axioms. In our example, we will use Azure, so the authoritative source will be Azure Resource Graph.

We prepare a query to get all peered VNets belonging to Management Groups MG1 and MG2:

No alt text provided for this image

Then, we execute the query to get a stream of JSON objects that we store in a file (peers.json):

No alt text provided for this image

Believe it or not, but the deserialization of peers.json is our set of axioms! Why? Because a peering is a pair of VNet IDs, and peers.json contains the list of all such pairs.

Quite handy :)

Step #2: partition the VNets

Now let's review the pairs one by one. Each time a VNet ID v1 is peered to a VNet ID v2, we express this through an equivalence relation. As you may know, an equivalence relation is a binary relation that is reflexive, symmetric and transitive. More importantly, an equivalence relation defines a partition over a set.

Here we are going to use equality as our equivalence relation between VNet IDs. Equality is trivially reflexive (a==a), symmetric (if a==b then b==a) and transitive (if a==b and b==c then a==c), therefore it defines a partition over our set of VNet IDs.

Let's see out it plays out with a quick example. Imagine we have 5 VNets peered in the following way:

No alt text provided for this image

  • Since a is peered with b and c, we express the following equalities: a==b and a==c
  • Since b is only peered with a, we state b==a
  • Likewise for c, d and e: c==a, d==e, and e==d

No alt text provided for this image

If we submit these equality relations to an SMT solver able to handle Equality and Uninterpreted Functions (EUF), then it will automatically partition our set into 2 equivalence classes (the "green balls" to the right)

Here is a Python code snippet that shows how to add an equality constraint for each peering pair (p1,p2) into Z3:

No alt text provided for this image

A couple of comments:

  • (line 2) : Z3 expressions are typed. Here, our VNet IDs are of custom type VnetSort
  • (lines 6 and 10): since we don't know in advance the actual IDs of the VNets we are going to retrieve, we add them into Python's globals() dynamically as we iterate over the list

Step #3: prove (or disprove?) VNets connectivity

Now if you zoom in the green balls, you will see that they match the peering topology provided by our authoritative source to perfection:

No alt text provided for this image

To get this outcome with Z3, we first ensure the axioms we loaded up are satisfiable (by definition they always are, unless we've messed with ARG...), then we dump the resulting satisfiability model:

No alt text provided for this image

Now if a theorem of ours states that "all our VNets are meshed", then Z3 will disprove it because this assumption is not satisfiable based on our axioms. (Our VNets are clustered into two independent green balls, this contradicts the theorem)

Derivation is very fast, even when thousands of VNets are considered. The reason for it is that the process of "filling up" equivalence classes with VNet IDs doesn't depend on the underlying network topology, it only depends on the number of IDs under scrutiny.

That's all for today, unless you wish to...

... Go one step further?

Checking connectivity is nice, but a more useful automated reasoning control for Cloud networking is proving that the Hub & Spoke pattern is enforced in all your environments.

Auditing whether the Hub & Spoke pattern enforcement is in place can become intractable pretty quick for a large enterprise with thousands of peerings.


No alt text provided for this image

Last week I uploaded a technical paper on Github to explain how to implement this control with EUF . The proof relies not only on equalities (the 'E' part of EUF), but also on uninterpreted functions (the 'UF' part). I like to compare these beasts to quantum wave functions. They can take pretty much any value until a measurement is made! At first sight, UFs are quite intriguing but also a lot of fun to fiddle with.

Like any object in Z3, an UF is typed. Let's take the example of some simple function s(.) which takes a subscription ID as input and outputs an integer. Its Z3 signature is therefore:

No alt text provided for this image

In my Hub & Spoke proof, I use s(.) as an oracle. Under the hood, it binds an uninterpreted symbol (here: a subscription ID) to an interpreted integer of this very symbol by a home-made process that I used quite often in previous posts and articles and that I call arithmetization:

No alt text provided for this image

Somehow s(.) behaves like the collapse of a wave function in quantum mechanics: it "forces" the interpretation of an otherwise uninterpreted symbol. It is even stronger than that, because we can repeatedly[*] and consistently attach the same integer to any given symbol.

  • Without arithmetization, Z3 would merrily satisfy statements like subId1 == subId2 even if the subscriptions IDs are different, because subId1 and subId2 are only symbols awaiting to take any values,
  • With arithmetization we invite integers into our the realm of EUF theorems and axioms. It becomes possible to reason at the "level of numbers". We cannot satisfy subId1 == subId2 anymore if 1 is assigned to symbol subId1 and 2 is assigned to subId2.

Below is a rather useless but very simple Z3 example stating that for all x and y, if s(x) is not equal to s(y), then x is not equal to y.

No alt text provided for this image

The truth of such a statement may not spring to mind when we consider a mix of arithmetized and uninterpreted subscription IDs. But indeed it is satisfied by SMT:

  1. If x and y and two real life subscription Ids, if their integer hashes are different then by construction they are two distinct subscriptions, so the statement is satisfiable.
  2. Now if x or y (or both) are uninterpreted, then they can take any arbitrary value. If there are no other constraints than s(x)!=s(y), it's an elementary task for Z3 to assign x and y different values, so once again, the statement is satisfiable.

Conclusion

Congratulations! You now have all it takes to read my technical paper or to explore EUF on your own.

More very useful SMT proofs will be released over time for IaaS and PaaS: better get ready now! And why not start devising your own proofs?


Notes

[*]: Python's hash() function will return a persistent result if you set the PYTHONHASHSEED environment variable to 0 or if you set your own seed.

Tarak ??

Senior PM/ PMM | Growth Manager | ex-CPO

2 年

Thanks for sharing Christophe! It’s well written and easily understandable!

Deming WANG

Cyber Security Architect

2 年

Thank you for sharing. Maybe you can improve continuously:) - VNet peering isn’t transitive - multi-level hub-and-spokes architecture isn’t recommended - cloud resource configuration can be changed dynamically by API. We need check the event logs and implement policy based control on the resource level.

要查看或添加评论,请登录

Christophe Parisel的更多文章

  • The nested cloud

    The nested cloud

    Now is the perfect time to approach Cloud security through the interplay between data planes and control planes—a…

    7 条评论
  • Overcoming the security challenge of Text-To-Action

    Overcoming the security challenge of Text-To-Action

    LLM's Text-To-Action (T2A) is one of the most anticipated features of 2025: it is expected to unleash a new cycle of…

    19 条评论
  • Cloud drift management for Cyber

    Cloud drift management for Cyber

    Optimize your drift management strategy by tracking the Human-to-Scenario (H/S) ratio: the number of dedicated human…

    12 条评论
  • From Art to Craft: A Practical Approach to Setting EPSS Thresholds

    From Art to Craft: A Practical Approach to Setting EPSS Thresholds

    Are you using an EPSS threshold to steer your patch management strategy? Exec sum / teaser EPSS is an excellent exposer…

    13 条评论
  • The security of random number generators (part 1)

    The security of random number generators (part 1)

    Cryptography as we know it today was born in the seventies, when Diffie and Helmann invented public key cryptosystems…

    13 条评论
  • How Microsoft is modernizing Azure

    How Microsoft is modernizing Azure

    Clearly, Microsoft put a lot of love in the making of Azure Bicep. Unlike its perplexing parent, ARM templates, all the…

  • A fresh take on time series forecasting

    A fresh take on time series forecasting

    We introduce a new machine learning technique that outperforms XG Boost for anticipating some critical EPSS (Exploit…

    8 条评论
  • The threat of Azure service tags

    The threat of Azure service tags

    Like all real disruptions, firewall objects have a sunny and a dark side. In Azure, the most important firewall objects…

    11 条评论
  • State of Confidential Containers (opinionated)

    State of Confidential Containers (opinionated)

    Confidential containers (CoCo) are the next frontier in Cloud PaaS for they deliver the promise of noOps containerized…

    5 条评论
  • Celebrating my 100th article!

    Celebrating my 100th article!

    For this special edition, I'm not going to play by the usual script, I won’t hand-pick a selection of ? recommended ?…

    10 条评论