Why does EVPN play a smaller role in Cisco ACI?
Vahid Nazari, Data Center Consulting Engineer


Ethernet VPN, or EVPN, is one of the best-known protocols in both service provider and data center fabrics. It extends BGP with an L2VPN address family, making it possible to carry endpoint reachability information such as MAC and IP addresses. This combination of BGP and EVPN results in a strong control plane for VXLAN, which is why the comprehensive solution is named "VXLAN MP-BGP EVPN". BGP EVPN provides significant enhancements for VXLAN, such as ARP suppression, the distributed IP anycast gateway, endpoint mobility, Virtual Port-Channel (vPC) support, and so on.
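To make this concrete, here is a minimal Python sketch of the kind of information an EVPN Route Type 2 (MAC/IP advertisement) carries, and of the fabric-wide distribution behavior that BGP EVPN performs. The class and function names are illustrative assumptions for this article, not a real BGP or NX-OS data structure.

```python
# Conceptual sketch only: an EVPN Type-2 route and its fabric-wide distribution.
from dataclasses import dataclass

@dataclass(frozen=True)
class EvpnType2Route:
    mac: str            # endpoint MAC address
    ip: str             # endpoint IP address (optional in real EVPN)
    l2vni: int          # Layer 2 VNI the endpoint belongs to
    next_hop_vtep: str  # VTEP (leaf) behind which the endpoint lives

def advertise_to_all_leaves(route, leaf_tables):
    """BGP EVPN distributes the route fabric-wide: every other leaf installs it,
    whether or not it ever talks to this endpoint."""
    for leaf, table in leaf_tables.items():
        if leaf != route.next_hop_vtep:
            table.append(route)

# Example: one endpoint behind Leaf101 ends up in every other leaf's table.
tables = {f"Leaf{n}": [] for n in range(101, 105)}
ep1 = EvpnType2Route(mac="0000.1111.aaaa", ip="10.1.1.10",
                     l2vni=30001, next_hop_vtep="Leaf101")
advertise_to_all_leaves(ep1, tables)
print({leaf: len(entries) for leaf, entries in tables.items()})
# {'Leaf101': 0, 'Leaf102': 1, 'Leaf103': 1, 'Leaf104': 1}
```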

Although Cisco ACI leverages VXLAN to build its infrastructure, once you dig into the technology you realize that some roles have changed, which may seem strange at first. Pay attention: EVPN is still used as part of the overlay control plane in some ACI solutions in order to exchange endpoint reachability information across Pods or Sites (for instance, ACI Multi-Pod, ACI Multi-Site, and ACI Remote Leaf). However, it is no longer utilized within any individual Pod. At first, you might think it has simply been replaced by another protocol, but there is more to it than that. Endpoint learning in ACI has taken a fundamentally different path compared to VXLAN BGP EVPN. We've heard a lot about BGP EVPN's features, so why did these changes happen?

To make a long story short, Cisco ACI is not supposed to do the same thing that VXLAN does! Rather, this technology has given rise to larger ambitions and dreams.

Generally speaking, comparing Cisco ACI with VXLAN BGP EVPN is basically wrong, since they are not meant to do the same thing. It is obvious that some enhancements may be required to accommodate new features and greater goals, and of course those enhancements don't involve only EVPN.

How is endpoint learning done in VXLAN BGP EVPN?

In VXLAN BGP EVPN, all leaf nodes within a single fabric advertise, learn, and store all endpoint information, even if no endpoint behind a given switch ever needs that data. This is one of the behaviors that Cisco ACI set out to improve.

  • To begin with, Cisco ACI is more than simply a large network switch. ACI provides a zero-trust network, which effectively takes the shape of a massive firewall. Just as a firewall may hold a vast number of security policies, Cisco ACI holds zoning rules that are programmed on the leaf nodes, and those rules consume hardware resources. At the same time, the technology has to be capable of supporting very large infrastructures. As a preliminary conclusion, the resources have to be used more efficiently.

The hardware resource savings are a huge advantage for a scalable fabric.

  • In line with the above, leaf nodes in ACI don't have to consume their hardware resources to store information about all endpoints. Rather, they store only the entries for remote endpoints with which the leaf is actively communicating.
  • Cisco ACI endpoint learning provides scalable forwarding within the fabric. For instance, in BGP+EVPN each movement of an endpoint triggers a new update message to all leaf nodes. In ACI, on the other hand, bounce entries ensure that only three components need to be updated during each move, regardless of how many leaf switches the fabric contains (see the sketch after this list).
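The following sketch illustrates that idea. It is a conceptual model of what the article describes, not ACI internals: on a move, only the new leaf, the old leaf (which installs a bounce entry), and the spine COOP database are touched; other leaves simply relearn from the data plane later.

```python
# Conceptual sketch (illustrative only) of why an endpoint move in ACI touches a
# fixed number of components, independent of the number of leaf switches.

def move_endpoint(ep, old_leaf, new_leaf, coop_db):
    """Return how many components are updated for one endpoint move."""
    updates = 0

    # 1) The new leaf learns the endpoint locally from the data plane.
    new_leaf[ep] = "local"
    updates += 1

    # 2) The new leaf reports the move, updating the spine COOP database.
    coop_db[ep] = "behind-new-leaf"
    updates += 1

    # 3) The old leaf replaces its local entry with a bounce entry pointing
    #    to the new location, so stale traffic is redirected, not dropped.
    old_leaf[ep] = "bounce-to-new-leaf"
    updates += 1

    return updates

old_leaf, new_leaf, coop_db = {"EP1": "local"}, {}, {"EP1": "behind-old-leaf"}
print(move_endpoint("EP1", old_leaf, new_leaf, coop_db))  # 3, whether the fabric has 4 leaves or 400
```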

Furthermore, VXLAN BGP EVPN has no appropriate option for a stretched fabric. Of course, there is VXLAN Multipod, along with the Multi-Fabric and Multi-Site solutions. But due to some critical shortcomings in VXLAN Multipod, it is essentially not recommended for multi-pod or active-active data centers that are geographically dispersed (I'll go over its issues later). This means that even for multi-data-center environments that logically belong to the same stretched-fabric topology, we have to choose the VXLAN Multi-Site solution, which assumes several separate VXLAN fabrics interconnected together.

Of course, VXLAN Multi-Site is recognized as a brilliant technology that provides both Layer 2 and Layer 3 interconnection for completely independent VXLAN fabrics, but its main use case is DCI. There is no end-to-end VXLAN tunneling in this solution; that is, for sending and receiving just a single packet between data centers, VXLAN encapsulation is performed six times. This could potentially be problematic for mission-critical applications such as high-frequency trading, virtual reality over networks, peak banking-transaction loads, and so on.

  • Cisco ACI has also introduced a Multi-Site solution in which both Layer 2 and Layer 3 communication across ACI fabrics is possible. Nevertheless, ACI has concentrated heavily on the stretched-fabric topology through its Multi-Pod solution, which brings significant enhancements in terms of endpoint learning mechanisms and failure-domain isolation.

That's how Cisco has attempted to provide one well-fitting ACI solution for every scenario and requirement.

  • In general, 'Cisco ACI Multi-Pod', 'ACI Remote Leaf', and 'ACI vPod' are all part of the ACI stretched-fabric family, configured and maintained through a single APIC cluster. Each solution, however, targets a specific condition, so that even the simplest and smallest infrastructures are covered.
  • For instance, ACI Multi-Pod is introduced either for a very large infrastructure in which it may not be possible to deploy a single leaf-and-spine architecture, or for active/active sites that are geographically dispersed (a single AZ in AWS terminology). On the flip side, there may be a small remote site where it is not possible or desirable to deploy a full ACI Pod (with leaf and spine nodes); ACI Remote Leaf would be a solid option there. In addition, there may be a data center with a very small footprint that only stores backups and provides no services; for such sites we can leverage a software-only extension (ACI vPod), where there are basically no physical leaf and spine nodes.

How is endpoint learning done in Cisco ACI?

In contrast, Cisco ACI learns endpoint information in the data plane during packet forwarding, so there is no MP-BGP+EVPN up and running inside each ACI Pod.

Keep in mind that MP-BGP with the VPNv4 address family still exists in the overlay-1 VRF inside the infra tenant; it is used to distribute external routes from the border leaf switches to the other leaf switches.

Cisco ACI relies on the resources of the spine switches, rather than the leaf switches, to collect and store all endpoint information.

It sounds more efficient.

ACI actually uses the Council of Oracle Protocol (COOP) database located on each spine switch, which is known as an "oracle". Since hosts are directly connected to the leaf switches, each leaf, known as a "citizen", is responsible for reporting its local endpoints to the COOP database. As a result, all endpoint information in the ACI fabric is stored in the spine COOP database. Consequently, a leaf switch does not need to already hold remote endpoint information, because it can simply forward packets to the spine whenever it doesn't know about a particular remote endpoint. This forwarding behavior is called "hardware proxy" or "spine proxy".
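Here is a rough Python sketch of that citizen/oracle relationship. The class names and the report format are assumptions made purely for illustration; they are not the actual COOP implementation.

```python
# Conceptual sketch of COOP reporting (illustrative only, not the real protocol):
# each leaf ("citizen") reports only its locally learned endpoints to the
# spines ("oracles"), so the full endpoint table lives on the spines.

class Oracle:                      # a spine switch
    def __init__(self):
        self.coop_db = {}          # endpoint -> owning leaf (VTEP)

    def receive_report(self, leaf, endpoints):
        for ep in endpoints:
            self.coop_db[ep] = leaf

class Citizen:                     # a leaf switch
    def __init__(self, name, local_endpoints):
        self.name = name
        self.local_endpoints = local_endpoints

    def report_to(self, oracle):
        oracle.receive_report(self.name, self.local_endpoints)

spine = Oracle()
for leaf in (Citizen("Leaf101", ["EP1", "EP2"]), Citizen("Leaf102", ["EP3"])):
    leaf.report_to(spine)

print(spine.coop_db)   # {'EP1': 'Leaf101', 'EP2': 'Leaf101', 'EP3': 'Leaf102'}
```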

A key point: in BGP+EVPN a leaf switch already has the endpoint information, but in ACI is there anything special beyond simply transmitting traffic to a spine? Hardware proxy essentially means: "Hey, dear leaf. You have no idea about the destination? It's OK, don't bother yourself. Just keep forwarding; I know what to do." Some engineers I've spoken with believed that leaf switches query the spine, fetch the information, and then forward the traffic. That's incorrect; hardware proxy doesn't work like that.
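The sketch below, under the same illustrative assumptions as before, shows the difference from a query/response model: on a miss the leaf never asks and waits, it simply encapsulates toward the spine-proxy anycast VTEP and the spine completes the delivery.

```python
# Conceptual sketch of hardware (spine) proxy forwarding. Illustrative only:
# the leaf does NOT query the spine and pause for an answer; it forwards the
# packet toward the spine-proxy anycast VTEP and the spine finishes the job.

SPINE_PROXY_VTEP = "10.0.0.65"      # assumed anycast proxy address

def leaf_forward(dst_ep, local_cache):
    if dst_ep in local_cache:                   # endpoint already known
        return f"encap to {local_cache[dst_ep]}"
    return f"encap to {SPINE_PROXY_VTEP}"       # miss: hand it to the proxy, no lookup pause

def spine_forward(dst_ep, coop_db):
    return f"re-encap to {coop_db[dst_ep]}"     # the spine always knows the owner

coop_db = {"EP3": "Leaf102-VTEP"}
leaf_cache = {}                                 # Leaf101 has never talked to EP3
hop1 = leaf_forward("EP3", leaf_cache)          # 'encap to 10.0.0.65'
hop2 = spine_forward("EP3", coop_db)            # 're-encap to Leaf102-VTEP'
print(hop1, "->", hop2)
```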

Don't worry about silent hosts, endpoint mobility, or even the movement of an IP address to a new MAC. The solutions are already provided.

Now let's drill down into the VXLAN Multipod solution and go over its issues.

VXLAN BGP EVPN Multi-Pod

The first solution for extending a VXLAN fabric across more than one location is illustrated below and is known as VXLAN Multipod (by now more or less deprecated). The diagram shows the most practical way of implementing it, in which the control plane protocols are isolated between the two pods.

VXLAN Multi-Pod

Even though the control plane protocols, including the underlay IGP and the overlay BGP, are separated from each other, the same VXLAN EVPN fabric is extended across the different locations, which makes the whole infrastructure function like a single VXLAN fabric. In this situation, all endpoint information, including MAC and IP addresses, is shared and advertised between the two pods. With that said, for each movement of an endpoint across leaf nodes in Pod 1, a new control plane update is sent towards Pod 2, since that is the default behavior of BGP+EVPN in a single VXLAN fabric (end-to-end EVPN updates). The failure domain stays extended across all the pods, and the scalability limits remain those of a single VXLAN fabric, because that is still exactly what it is. Eventually, as I mentioned before, all leaf switches have to learn all the endpoint information. Because of these shortcomings, this solution is not recommended for active/active geographically dispersed sites.

Cisco ACI Multi-Pod, meanwhile, is a completely different story!

  • ACI Multi-Pod is also a single ACI fabric; however, in this solution separate instances of the IS-IS, COOP, and MP-BGP protocols run locally inside each Pod. This makes end-to-end VXLAN tunneling possible while isolating the failure domains between Pods as much as possible.
  • Local endpoint information belonging to each Pod is never advertised to remote leaf switches beforehand. Instead, ACI leaf switches learn remote MAC and IP information during packet forwarding in the data plane (ideally via hardware-proxy-based forwarding), just as happens within a single Pod.

Why not outsource the responsibility for keeping endpoint information to the spines? Don't forget that the COOP database within each Pod already contains all the local endpoint information, and these databases are synchronized across Pods through MP-BGP EVPN.

  • Each Pod has a dedicated anycast VTEP address that is available on all the spine nodes deployed in it. The first time endpoint EP1 in Pod 1 is added to the local COOP database, an MP-BGP EVPN update is sent to the remote spine nodes in Pod 2. The receiving spine adds the information to its COOP database and synchronizes it with all the other local spines. EP1 is now associated in Pod 2 with the anycast VTEP address of Pod 1, which is used as the next-hop address for hardware proxy. In this way, EP1 can move freely among the leaves of Pod 1 without any new updates being sent to Pod 2 (see the sketch below).
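The following sketch, using purely illustrative names and addresses, shows why an intra-pod move stays invisible to the other Pod: Pod 2's COOP entry points to Pod 1's anycast proxy VTEP rather than to a specific leaf, so nothing Pod 2 has stored needs to change.

```python
# Conceptual sketch (illustrative values) of inter-pod endpoint state in ACI
# Multi-Pod: the remote pod maps the endpoint to the source pod's anycast
# proxy VTEP, so intra-pod moves do not generate new inter-pod updates.

POD1_ANYCAST_VTEP = "10.1.255.1"     # assumed anycast address shared by Pod 1 spines

pod1_coop = {"EP1": "Pod1-Leaf101"}          # exact local location
pod2_coop = {"EP1": POD1_ANYCAST_VTEP}       # only "somewhere in Pod 1"

def move_within_pod1(new_leaf):
    pod1_coop["EP1"] = new_leaf              # Pod 1 updates its own COOP database
    # No MP-BGP EVPN update leaves the pod: Pod 2's entry is already correct.

before = dict(pod2_coop)
move_within_pod1("Pod1-Leaf103")
assert pod2_coop == before                   # Pod 2 state unchanged
print(pod1_coop, pod2_coop)
```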

VXLAN Multipod has been superseded by more effective solutions, such as VXLAN Multi-Fabric and, especially, VXLAN Multi-Site, which is one of the most appropriate solutions for traditional multi-data-center environments.

VXLAN BGP EVPN Multi-Site

EVPN Multi-Site technology is based on IETF draft-sharma-multi-site-evpn. In this solution, two or more completely independent VXLAN fabrics are interconnected through a VXLAN BGP EVPN Layer 2 and Layer 3 overlay. This overlay network is also known as the "site-external network". Thus, unlike the previous Multipod architecture, there is neither a shared EVPN fabric nor an extended underlay across the different sites.

VXLAN EVPN Multi-Site

VXLAN EVPN Multi-Site can be used for scaling a large intra-DC network, for Data Center Interconnect (DCI), and also for integration with legacy networks. As you can see in the picture above, the key functional components of this architecture are the Border Gateways (BGWs).


Briefly, the BGWs separate the VXLAN fabric side from the site-external network and mask the site-internal VTEPs. What does this mean? As shown in the following picture, the border gateway re-encapsulates traffic and rewrites the outer source and destination addresses. That is, for each send-and-receive exchange of traffic, VXLAN encapsulation is performed six times, as the steps below and the short sketch after them illustrate.


  1. Leaf10 to BGW11
  2. BGW11 to the BGW of the next site (BGW22)
  3. BGW22 to Leaf20
  4. Leaf20 to BGW21
  5. BGW21 to the BGW of the first site (BGW11)
  6. BGW11 to Leaf10
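A trivial sketch of the arithmetic behind that claim, using the hypothetical node names from the list above: the request and its reply each cross two BGWs, and every hop in the list is a fresh VXLAN encapsulation.

```python
# Counting the VXLAN encapsulation operations for one request/reply exchange
# across an EVPN Multi-Site boundary (node names are hypothetical).

request_path = ["Leaf10", "BGW11", "BGW22", "Leaf20"]   # site 1 -> site 2
reply_path   = ["Leaf20", "BGW21", "BGW11", "Leaf10"]   # site 2 -> site 1

def encap_operations(path):
    # Each sending node on the path encapsulates the frame once before
    # handing it to the next VXLAN tunnel segment.
    return len(path) - 1

total = encap_operations(request_path) + encap_operations(reply_path)
print(total)   # 6
```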

Of course, the forwarding enhancements are not limited to EVPN. There is also no multicast PIM running to handle BUM traffic within the ACI fabric; instead, ACI relies on the FTAG mechanism to build multicast-tree-like forwarding paths among the leaf and spine nodes, as the sketch below illustrates.
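Here is a conceptual sketch of the FTAG idea; the tree IDs, root assignments, and hash inputs are assumptions made for illustration only. The point is simply that BUM traffic is mapped onto one of several pre-built forwarding trees rooted at the spines, instead of relying on PIM-signalled multicast trees.

```python
# Conceptual sketch of FTAG-style BUM forwarding (illustrative only): the
# fabric pre-builds a small set of loop-free trees rooted at spine nodes, and
# each BUM flow is hashed onto one tree instead of using PIM-built multicast.

import hashlib

# Assumed example: a few FTAG trees, each identified by an ID and a root spine.
FTAG_TREES = {0: "Spine201", 1: "Spine202", 2: "Spine201", 3: "Spine202"}

def select_ftag(src_vtep, bd_vnid):
    """Hash the flow onto one of the available trees (hash inputs are assumptions)."""
    digest = hashlib.md5(f"{src_vtep}-{bd_vnid}".encode()).hexdigest()
    return int(digest, 16) % len(FTAG_TREES)

ftag = select_ftag("Leaf101-VTEP", 15794150)
print(f"BUM traffic for this BD is flooded along FTAG {ftag}, rooted at {FTAG_TREES[ftag]}")
```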

