ACI Multi-Site Part#3 || Packet Flow
Shehab Wagdy Nagy
Cloud Enthusiast: AWS | CCIE | SDN Solutions | ACI | Network Automation Enthusiast
An ACI Multi-Site deployment uses distinct overlay control plane and data plane functions to connect endpoints that are deployed across different sites.
MP-BGP EVPN serves as the control plane between spine switches. It exchanges host information for discovered endpoints that belong to separate fabrics, so that these endpoints can communicate east-west.
Once that endpoint information is exchanged, the VXLAN data plane is used to carry intersite traffic.
Let's look at how the control plane and data plane are handled in detail.
Multi-Site Overlay Control Plane
As we have seen so far, for endpoints in different sites to communicate with each other, their endpoint information (EID) needs to be shared between sites. ACI Multi-Site does this with MP-BGP EVPN between the spine switches of the different sites.
For endpoint information to be shared with other sites, Cisco NDO needs to indicate which EPG is stretched across sites, or the endpoint must belong to a non-stretched EPG that has a contract allowing communication with an EPG in a different site. There are two scenarios here:
1- If the endpoint has an IP address, it is shared across sites via MP-BGP EVPN.
2- If the endpoint has no IP address, it is shared only when Layer 2 Stretch is enabled on the NDO.
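The two scenarios above can be sketched as a small decision function. This is only an illustration of the rule as described here, not how the spines or NDO implement it; the `Endpoint` class, the function name, and the sample addresses are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Endpoint:
    mac: str
    ip: Optional[str] = None  # None models an endpoint without an IP address


def shared_across_sites(ep: Endpoint, l2_stretch_enabled: bool) -> bool:
    """Return True if the endpoint would be shared with remote sites.

    Assumption: NDO has already marked the endpoint's EPG as stretched,
    or a contract allows it to talk to an EPG in another site.
    """
    if ep.ip is not None:
        # Scenario 1: an endpoint with an IP address is shared via MP-BGP EVPN.
        return True
    # Scenario 2: an endpoint without an IP address is shared only with Layer 2 Stretch.
    return l2_stretch_enabled


if __name__ == "__main__":
    print(shared_across_sites(Endpoint("00:50:56:aa:bb:01", "10.10.10.11"), False))
    print(shared_across_sites(Endpoint("00:50:56:aa:bb:02"), False))
```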
Before moving on to the overlay data plane, let us understand how MP-BGP EVPN is used to share endpoint information across sites:
Multi-Site Overlay Data Plane
After endpoint information is exchanged across sites, the VXLAN data plane is used to allow intersite Layer 2 and Layer 3 communication. Let's look at how ACI Multi-Site handles the different traffic scenarios (BUM and unicast traffic).
BUM Traffic between sites:
The deployment of VXLAN allows the use of a logical abstraction so that endpoints separated by multiple Layer 3 hops can communicate as if they were part of the same logical Layer 2 domain.
For BUM traffic, ACI Multi-Site enables ingress replication on the source VXLAN TEP (VTEP) devices. They create multiple unicast copies of each BUM frame, one for each remote VTEP behind which endpoints in the same Layer 2 domain are connected. Once the BUM frame, encapsulated toward a unicast O-MTEP address, reaches the destination site, that site replicates the frame and floods it within the site. This approach is known as head-end replication.
BUM traffic is forwarded to other sites only when Intersite BUM Traffic Allow is enabled on the bridge domain, which lets the flooded traffic reach the other sites as well.
There are three different types of Layer 2 BUM traffic: broadcast, unknown unicast, and multicast.
In the example below, we will check how Layer 2 BUM traffic is handled:
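The replication behavior described above can be sketched roughly as follows. This is a simplified model under stated assumptions: the function name, the site names, and the O-MTEP addresses are hypothetical, and the real forwarding happens in hardware on the spines.

```python
from typing import Dict, List, Tuple


def replicate_bum(
    frame: bytes,
    local_site: str,
    site_o_mteps: Dict[str, str],
    intersite_bum_allowed: bool,
) -> List[Tuple[str, bytes]]:
    """Return one (destination O-MTEP, frame) unicast copy per remote site.

    site_o_mteps maps each site where the bridge domain is stretched
    to that site's O-MTEP address (hypothetical values).
    """
    if not intersite_bum_allowed:
        # Without "Intersite BUM Traffic Allow", flooding stays inside the local site.
        return []
    return [
        (o_mtep, frame)
        for site, o_mtep in sorted(site_o_mteps.items())
        if site != local_site
    ]


if __name__ == "__main__":
    o_mteps = {"site1": "172.16.1.100", "site2": "172.16.2.100", "site3": "172.16.3.100"}
    for dst, _ in replicate_bum(b"arp-request", "site1", o_mteps, True):
        print("unicast copy sent to", dst)
```

Each remote site receives exactly one copy and is then responsible for flooding it locally, which is why the source site does not need to know the individual leaf nodes in the other fabrics.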
Depending on the number of configured bridge domains, the same GIPo address may be associated with different bridge domains. Thus, when flooding for one of those bridge domains is enabled across sites, BUM traffic for the other bridge domains using the same GIPo address is also sent across the sites and is then dropped on the receiving spine nodes. This behavior can increase the bandwidth utilization in the intersite network.
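To make the GIPo overlap concrete, here is a minimal sketch that finds the bridge domains whose BUM traffic would cross the ISN only because they share a GIPo address with a flooding-enabled bridge domain. The bridge domain names and multicast addresses are made up for illustration.

```python
from typing import Dict, Set


def collateral_bridge_domains(bd_gipo: Dict[str, str], flooding_bds: Set[str]) -> Set[str]:
    """Return bridge domains that share a GIPo with an intersite-flooding BD.

    Their BUM traffic is also sent across sites and then dropped
    on the receiving spine nodes.
    """
    flooded_gipos = {bd_gipo[bd] for bd in flooding_bds}
    return {
        bd
        for bd, gipo in bd_gipo.items()
        if gipo in flooded_gipos and bd not in flooding_bds
    }


if __name__ == "__main__":
    # Hypothetical BD-to-GIPo assignments
    bd_gipo = {"BD1": "225.0.10.16", "BD2": "225.0.10.16", "BD3": "225.0.20.32"}
    print(collateral_bridge_domains(bd_gipo, {"BD1"}))
```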
To solve this issue and avoid consuming ISN bandwidth with useless traffic, when a bridge domain is configured as stretched with "Intersite BUM Traffic Allow" enabled from the Cisco NDO, a GIPo address is assigned by default from a separate range of multicast addresses. This is reflected in the user interface by the "Optimize WAN Bandwidth" flag, which is enabled by default for bridge domains created by the NDO.
In the example below, we can verify that the policy configured by NDO is reflected on ACI.
Unicast Traffic between sites:
For endpoints on the same subnet to communicate with each other, they need to exchange ARP information, and in ACI the ARP handling depends on the ARP Flooding setting in the bridge domain.
There are two different scenarios to consider:
8. EP2 sends an ARP reply to EP1.
9. The local leaf node encapsulates the traffic toward the remote O-UTEP A address.
10. The spine nodes also rewrite the source IP address of the VXLAN-encapsulated packet with the local O-UTEP B address, which identifies Site 2.
11. The VXLAN frame is received by the spine node in Site 1, which translates the original VNID and class ID values of Site 2 to locally significant ones (Site 1) and sends the frame toward the local leaf node to which EP1 is connected.
12. The leaf node receives the frame, de-encapsulates it, and learns the class ID and site location information for the remote endpoint EP2.
13. The frame is then sent to the local interface on the leaf node and reaches EP1.
Let's recap what we have discovered in this topic:
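Steps 10 and 11 can be illustrated with a small model of the two spine operations: the source rewrite on the sending site and the VNID/class ID translation on the receiving site. This is a sketch under assumptions, not the actual spine implementation; the packet fields, function names, addresses, and ID values are all hypothetical, and forwarding toward the destination leaf is not modeled.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class VxlanPacket:
    src_ip: str    # outer source IP
    dst_ip: str    # outer destination IP
    vnid: int      # VXLAN network identifier
    class_id: int  # source EPG class ID


def egress_spine_rewrite(pkt: VxlanPacket, local_o_utep: str) -> VxlanPacket:
    """Step 10: replace the outer source IP with the local site's O-UTEP."""
    return replace(pkt, src_ip=local_o_utep)


def ingress_spine_translate(
    pkt: VxlanPacket, vnid_map: Dict[int, int], class_id_map: Dict[int, int]
) -> VxlanPacket:
    """Step 11: translate remote VNID and class ID to locally significant values."""
    return replace(pkt, vnid=vnid_map[pkt.vnid], class_id=class_id_map[pkt.class_id])


if __name__ == "__main__":
    # Packet as sent by the Site 2 leaf toward O-UTEP A (hypothetical values)
    pkt = VxlanPacket(src_ip="10.2.0.65", dst_ip="172.16.1.1", vnid=16000, class_id=49153)
    pkt = egress_spine_rewrite(pkt, local_o_utep="172.16.2.1")      # Site 2 spine
    pkt = ingress_spine_translate(pkt, {16000: 15000}, {49153: 32770})  # Site 1 spine
    print(pkt)
```

The translation is needed because each site assigns VNID and class ID values independently, so the same bridge domain or EPG can carry different identifiers in each fabric.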
See you in the next topic.
Thanks a lot.