A primer of automated reasoning for Cloud engineers
https://creativepunking.com/how-saiva-academy-is-providing-sound-placements/

A primer of automated reasoning for Cloud engineers

Here's an invitation for an easy stroll through the intricacies of automated reasoning - the future of native cloud Security!

As a sequel to my first article, we are going to task Microsoft's amazing Z3 theorem prover with finding out whether a bunch of VPCs/VNets are connected through network peerings. Hopefully this example will get you accustomed with the basic concepts required for proving a wider range of AWS/Azure/GCP security patterns.

We will proceed in three steps: VNets acquisition, VNets partitioning and theorem proving.

Step #1: get some VNets

First we need to collect a listing of VNet IDs and their peers. If we get this information from an authoritative source then I believe we are allowed to call these peerings axioms. In our example, we will use Azure, so the authoritative source will be Azure Resource Graph.

We prepare a query to get all peered VNets belonging to Management Groups MG1 and MG2:

No alt text provided for this image

Then, we execute the query to get a stream of JSON objects that we store in a file (peers.json):

No alt text provided for this image

Believe it or not, but the deserialization of peers.json is our set of axioms! Why? Because a peering is a pair of VNet IDs, and peers.json contains the list of all such pairs.

Quite handy :)

Step #2: partition the VNets

Now let's review the pairs one by one. Each time a VNet ID v1 is peered to a VNet ID v2, we express this through an equivalence relation. As you may know, an equivalence relation is a binary relation that is reflexive, symmetric and transitive. More importantly, an equivalence relation defines a partition over a set.

Here we are going to use equality as our equivalence relation between VNet IDs. Equality is trivially reflexive (a==a), symmetric (if a==b then b==a) and transitive (if a==b and b==c then a==c), therefore it defines a partition over our set of VNet IDs.

Let's see out it plays out with a quick example. Imagine we have 5 VNets peered in the following way:

No alt text provided for this image

  • Since a is peered with b and c, we express the following equalities: a==b and a==c
  • Since b is only peered with a, we state b==a
  • Likewise for c, d and e: c==a, d==e, and e==d

No alt text provided for this image

If we submit these equality relations to an SMT solver able to handle Equality and Uninterpreted Functions (EUF), then it will automatically partition our set into 2 equivalence classes (the "green balls" to the right)

Here is a Python code snippet that shows how to add an equality constraint for each peering pair (p1,p2) into Z3:

No alt text provided for this image

A couple of comments:

  • (line 2) : Z3 expressions are typed. Here, our VNet IDs are of custom type VnetSort
  • (lines 6 and 10): since we don't know in advance the actual IDs of the VNets we are going to retrieve, we add them into Python's globals() dynamically as we iterate over the list

Step #3: prove (or disprove?) VNets connectivity

Now if you zoom in the green balls, you will see that they match the peering topology provided by our authoritative source to perfection:

No alt text provided for this image

To get this outcome with Z3, we first ensure the axioms we loaded up are satisfiable (by definition they always are, unless we've messed with ARG...), then we dump the resulting satisfiability model:

No alt text provided for this image

Now if a theorem of ours states that "all our VNets are meshed", then Z3 will disprove it because this assumption is not satisfiable based on our axioms. (Our VNets are clustered into two independent green balls, this contradicts the theorem)

Derivation is very fast, even when thousands of VNets are considered. The reason for it is that the process of "filling up" equivalence classes with VNet IDs doesn't depend on the underlying network topology, it only depends on the number of IDs under scrutiny.

That's all for today, unless you wish to...

... Go one step further?

Checking connectivity is nice, but a more useful automated reasoning control for Cloud networking is proving that the Hub & Spoke pattern is enforced in all your environments.

Auditing whether the Hub & Spoke pattern enforcement is in place can become intractable pretty quick for a large enterprise with thousands of peerings.


No alt text provided for this image

Last week I uploaded a technical paper on Github to explain how to implement this control with EUF. The proof relies not only on equalities (the 'E' part of EUF), but also on uninterpreted functions (the 'UF' part). I like to compare these beasts to quantum wave functions. They can take pretty much any value until a measurement is made! At first sight, UFs are quite intriguing but also a lot of fun to fiddle with.

Like any object in Z3, an UF is typed. Let's take the example of some simple function s(.) which takes a subscription ID as input and outputs an integer. Its Z3 signature is therefore:

No alt text provided for this image

In my Hub & Spoke proof, I use s(.) as an oracle. Under the hood, it binds an uninterpreted symbol (here: a subscription ID) to an interpreted integer of this very symbol by a home-made process that I used quite often in previous posts and articles and that I call arithmetization:

No alt text provided for this image

Somehow s(.) behaves like the collapse of a wave function in quantum mechanics: it "forces" the interpretation of an otherwise uninterpreted symbol. It is even stronger than that, because we can repeatedly[*] and consistently attach the same integer to any given symbol.

  • Without arithmetization, Z3 would merrily satisfy statements like subId1 == subId2 even if the subscriptions IDs are different, because subId1 and subId2 are only symbols awaiting to take any values,
  • With arithmetization we invite integers into our the realm of EUF theorems and axioms. It becomes possible to reason at the "level of numbers". We cannot satisfy subId1 == subId2 anymore if 1 is assigned to symbol subId1 and 2 is assigned to subId2.

Below is a rather useless but very simple Z3 example stating that for all x and y, if s(x) is not equal to s(y), then x is not equal to y.

No alt text provided for this image

The truth of such a statement may not spring to mind when we consider a mix of arithmetized and uninterpreted subscription IDs. But indeed it is satisfied by SMT:

  1. If x and y and two real life subscription Ids, if their integer hashes are different then by construction they are two distinct subscriptions, so the statement is satisfiable.
  2. Now if x or y (or both) are uninterpreted, then they can take any arbitrary value. If there are no other constraints than s(x)!=s(y), it's an elementary task for Z3 to assign x and y different values, so once again, the statement is satisfiable.

Conclusion

Congratulations! You now have all it takes to read my technical paper or to explore EUF on your own.

More very useful SMT proofs will be released over time for IaaS and PaaS: better get ready now! And why not start devising your own proofs?


Notes

[*]: Python's hash() function will return a persistent result if you set the PYTHONHASHSEED environment variable to 0 or if you set your own seed.

Tarak ??

Senior Growth Manager & CSM

2 年

Thanks for sharing Christophe! It’s well written and easily understandable!

Deming WANG

Cyber Security Architect

2 年

Thank you for sharing. Maybe you can improve continuously:) - VNet peering isn’t transitive - multi-level hub-and-spokes architecture isn’t recommended - cloud resource configuration can be changed dynamically by API. We need check the event logs and implement policy based control on the resource level.

要查看或添加评论,请登录

Christophe Parisel的更多文章

  • Adversarial lateral motion in Azure PaaS: are we prepared?

    Adversarial lateral motion in Azure PaaS: are we prepared?

    Lateral motion techniques are evolving in PaaS, and we should be worried. Let's discuss a risk confinement approach.

    19 条评论
  • How will Microsoft Majorana quantum chip ??compute??, exactly?

    How will Microsoft Majorana quantum chip ??compute??, exactly?

    During the 2020 COVID lockdown, I investigated braid theory in the hope it would help me on some research I was…

    16 条评论
  • Zero-shot attack against multimodal AI (Part 2)

    Zero-shot attack against multimodal AI (Part 2)

    In part 1, I showcased how AI applications could be affected by a new kind of AI-driven attack: Mystic Square. In the…

    6 条评论
  • Zero-shot attack against multimodal AI (Part 1)

    Zero-shot attack against multimodal AI (Part 1)

    The arrow is on fire, ready to strike its target from two miles away..

    11 条评论
  • 2015-2025: a decade of preventive Cloud security!

    2015-2025: a decade of preventive Cloud security!

    Since its birth in 2015, preventive Cloud security has proven a formidable achievement. By raising the security bar of…

    11 条评论
  • Exploiting Azure AI DocIntel for ID spoofing

    Exploiting Azure AI DocIntel for ID spoofing

    Sensitive transactions execution often requires to show proofs of ID and proofs of ownership: this requirements is…

    10 条评论
  • How I trained an AI model for nefarious purposes!

    How I trained an AI model for nefarious purposes!

    The previous episode prepared ground for today’s task: we walked through the foundations of AI curiosity. As we've…

    19 条评论
  • AI curiosity

    AI curiosity

    The incuriosity of genAI is an understatement. When chatGPT became popular in early 2023, it was even more striking…

    3 条评论
  • The nested cloud

    The nested cloud

    Now is the perfect time to approach Cloud security through the interplay between data planes and control planes—a…

    8 条评论
  • Overcoming the security challenge of Text-To-Action

    Overcoming the security challenge of Text-To-Action

    LLM's Text-To-Action (T2A) is one of the most anticipated features of 2025: it is expected to unleash a new cycle of…

    19 条评论