VXLAN EVPN Multi-Site Deployment

Hello everyone,

Today I would like to talk about and, more importantly, show the configuration steps for the VXLAN EVPN Multi-Site solution. Before we begin, here are some prerequisites for reading this article:

  • This article assumes that the reader has a solid understanding of VXLAN EVPN.
  • Little to no VXLAN EVPN Multi-Site theory will be covered in this article.
  • This article assumes that the reader has gone over the theory in depth and is familiar with the different VXLAN deployment types.
  • This deployment doesn't have any multicast configuration. Ingress replication is used instead of multicast.
  • The lab is built using Nexus9000 C9300v (nxos64-cs.10.3.1.F) switches as leaves, spines, and BGWs.

To understand the technology better please familiarize yourself with the official white paper and configuration guide.

Let's start with the lab design overview:

  1. Each site has four leaves, two spines, and one border gateway (BGW).
  2. The IGP domain is not shared between sites. OSPF is used as the IGP, and the OSPF process in Site-1 has no knowledge of Site-2's topology.
  3. vPC (virtual port channel) is used to connect leaf switches to hosts.
  4. The BGP IPv4 unicast address family is used among BGW-1, BGW-2, and the Route Server (RS) to exchange underlay reachability information (the loopbacks).
  5. In this article I'm going to present the minimum configuration needed to get Multi-Site working. Later on, you can always add complexity and tweak your device configurations.
  6. To keep this article neat, the configuration is presented for one device of each kind (leaf-1, spine-1, bgw-1, and rs).


Configuration

Let's enable the following features on Leaf-1:

cfs eth distribute
nv overlay evpn
feature ospf
feature bgp
feature fabric forwarding
feature interface-vlan
feature vn-segment-vlan-based
feature lacp
feature vpc
feature nv overlay        

The vPC is configured towards the L2 switch that acts as an intermediary device for host connectivity.

I'm going to use the mgmt0 interface as the keepalive link and E1/1-2 as the peer link. E1/7 is used on both leaves as the downstream link to SW-1. I'm also going to create the VLANs and assign vn-segments to them.

interface mgmt0
  vrf member management
  ip address 192.168.0.1/30

vpc domain 1
  peer-keepalive destination 192.168.0.2 source 192.168.0.1 vrf management

interface Ethernet1/1
  switchport mode trunk
  channel-group 100 mode active

interface Ethernet1/2
  switchport mode trunk
  channel-group 100 mode active

interface Ethernet1/7
  switchport mode trunk
  channel-group 1 mode active

interface port-channel1
  switchport mode trunk
  vpc 1

interface port-channel100
  switchport mode trunk
  spanning-tree port type network
  vpc peer-link

! VLAN-to-VNI mappings
vlan 10
  vn-segment 10
vlan 20
  vn-segment 20
vlan 100
  vn-segment 100        

Let's configure IP-VRF for our L3 VNI, SVIs, anycast gateway MAC address, and NVE (VTEP) interface.

fabric forwarding anycast-gateway-mac 0000.0000.0001

vrf context VXLAN
  vni 100
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn

! Host-facing SVI
interface Vlan10
  no shutdown
  vrf member VXLAN
  ip address 192.168.10.1/24
  fabric forwarding mode anycast-gateway

! Host-facing SVI
interface Vlan20
  no shutdown
  vrf member VXLAN
  ip address 192.168.20.1/24
  fabric forwarding mode anycast-gateway

! L3 SVI for VXLAN routing purposes
interface Vlan100
  no shutdown
  vrf member VXLAN
  ip forward

interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback0
  member vni 10
    suppress-arp
    ! Ingress replication is used for BUM traffic
    ingress-replication protocol bgp
  member vni 20
    suppress-arp
    ! Ingress replication is used for BUM traffic
    ingress-replication protocol bgp
  member vni 100 associate-vrf

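We also have to configure a matching secondary IP address on the loopback used as the NVE source interface (loopback0 in this case) for VXLAN to work with vPC. Both vPC peers share this secondary address, which acts as the anycast VTEP IP of the vPC pair.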

interface loopback0
  ip address 1.1.1.1/32
  ip address 3.3.3.3/32 secondary

The vPC is up and running.
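
To check this from the CLI, the commands below are one option. This is only a sketch of what to run on Leaf-1; the output is omitted since it depends on the lab.

```
! vPC domain status, peer-link state and the state of vPC 1
show vpc brief
! Keepalive status over the management VRF
show vpc peer-keepalive
! Port-channel1 (towards SW-1) and port-channel100 (peer link) with their member ports
show port-channel summary
```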

I've configured OSPF on the interfaces connected to the spines. Just a quick reminder: in a spine-leaf topology, leaves connect to spines, but spines are not connected to other spines, and leaves are not connected to other leaves (the vPC peer link aside).

The underlay configuration of Site-1 is pretty straightforward.

router ospf 1
  router-id 1.1.1.1

interface Ethernet1/3
  no switchport
  ip address 10.0.0.2/30
  ip router ospf 1 area 0.0.0.0
  no shutdown

interface Ethernet1/4
  no switchport
  ip address 10.4.4.2/30
  ip router ospf 1 area 0.0.0.0
  no shutdown

interface loopback0
  ip router ospf 1 area 0.0.0.0        
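
Once the spines are configured as well, the underlay can be checked with something like the following. This is a sketch, assuming the spine loopbacks 4.4.4.4 and 5.5.5.5 used in the BGP configuration further down.

```
! OSPF adjacencies towards Spine-1 and Spine-2 should be in the FULL state
show ip ospf neighbors
! The spine loopbacks should be learned via OSPF
show ip route 4.4.4.4
show ip route 5.5.5.5
! Loopback-to-loopback reachability, sourced from the primary VTEP address
ping 4.4.4.4 source 1.1.1.1
```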

The last thing we need to do on our Leaf-1 is to configure BGP.

We're peering with the loopbacks of Spine-1 and Spine-2. We send extended communities because, as you might remember, route-target values are carried as extended communities in the BGP Update message.

router bgp 65000
  address-family l2vpn evpn
    retain route-target all
  neighbor 4.4.4.4
    remote-as 65000
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
  neighbor 5.5.5.5
    remote-as 65000
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
evpn
  vni 10 l2
    rd auto
    route-target import auto
    route-target export auto
  vni 20 l2
    rd auto
    route-target import auto
    route-target export auto        

Note: I'm configuring the route target and route distinguisher as auto instead of setting explicit values. Within Cisco NX-OS, the auto-derived route target is constructed from the BGP ASN and the VNI (ASN:VNI). Given that all four leaves are in the same iBGP domain and the VLAN-to-VNI mappings are identical, this works perfectly fine. How is this going to work for Multi-Site? We'll get to that later.
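
For illustration, with ASN 65000 and the VNIs used in this lab, the auto keyword should be equivalent to the explicit values below. This is a sketch assuming a 2-byte ASN and the ASN:VNI derivation mentioned above; it isn't something you need to configure.

```
! Explicit equivalent of "route-target import/export auto" for L2 VNI 10 in AS 65000
evpn
  vni 10 l2
    rd auto
    route-target import 65000:10
    route-target export 65000:10

! Explicit equivalent for L3 VNI 100 of the IP-VRF
vrf context VXLAN
  vni 100
  rd auto
  address-family ipv4 unicast
    route-target both 65000:100
    route-target both 65000:100 evpn
```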

Next, let's move on to the Spine-1 configuration. Usually, a spine doesn't need any VXLAN-related configuration. We're only going to configure OSPF for the underlay and BGP with the L2VPN EVPN address family (AFI L2VPN, SAFI EVPN) to reflect the updates from the leaves.

With that being said, the spine requires fewer features to be enabled:

feature ospf
feature bgp
feature nv overlay
nv overlay evpn        

The OSPF configuration is also very straightforward:

router ospf 1
  router-id 4.4.4.4

interface Ethernet1/2
  no switchport
  ip address 10.10.10.1/30
  ip router ospf 1 area 0.0.0.0
  no shutdown

interface Ethernet1/3
  no switchport
  ip address 10.0.0.1/30
  ip router ospf 1 area 0.0.0.0
  no shutdown

interface Ethernet1/4
  no switchport
  ip address 10.8.8.1/30
  ip router ospf 1 area 0.0.0.0
  no shutdown

interface Ethernet1/5
  no switchport
  ip address 10.1.1.1/30
  ip router ospf 1 area 0.0.0.0
  no shutdown

interface Ethernet1/6
  no switchport
  ip address 10.9.9.1/30
  ip router ospf 1 area 0.0.0.0
  no shutdown

interface loopback0
  ip address 4.4.4.4/32
  ip router ospf 1 area 0.0.0.0
        

The BGP configuration isn't much different from Leaf-1's. However, we're adding route-reflector-client under L2VPN EVPN to overcome the iBGP split-horizon rule (routes learned from one iBGP peer are not advertised to other iBGP peers). This configuration also includes the peering with BGW-1, and as you may have noticed, route-reflector-client is missing there. Why is that? A route reflector passes routes from clients to non-clients and routes from non-clients to clients, so it's not necessary to configure BGW-1 as a route reflector client.

router bgp 65000
  router-id 4.4.4.4
  address-family l2vpn evpn
    retain route-target all
  ! Leaf-1
  neighbor 1.1.1.1
    remote-as 65000
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client
  ! Leaf-2
  neighbor 2.2.2.2
    remote-as 65000
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client
  ! Leaf-3
  neighbor 7.7.7.7
    remote-as 65000
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client
  ! Leaf-4
  neighbor 8.8.8.8
    remote-as 65000
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client
  ! BGW-1 (non-client)
  neighbor 77.77.77.77
    remote-as 65000
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
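
A quick way to confirm the spine's peerings could be the commands below. This is a sketch; no output is shown because it depends on the state of the lab.

```
! All four leaves and BGW-1 should show an established L2VPN EVPN session
show bgp l2vpn evpn summary
! Per-neighbor details for Leaf-1, including the address-family settings applied to it
show bgp l2vpn evpn neighbors 1.1.1.1
```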

Note: In this lab, Site-1 is in ASN 65000 and Site-2 is in ASN 64000.

Let's move on to the configuration of BGW-1. So far everything up to this moment should be very familiar. Here is where the new stuff comes in.

Let's enable the necessary features on BGW-1:

nv overlay evpn
feature ospf
feature bgp
feature fabric forwarding
feature interface-vlan
feature vn-segment-vlan-based
feature nv overlay        

Now, let's assign a site ID to our BGW:

evpn multisite border-gateway 1        

Note: All BGWs at the same site must have the same site ID. For Site-2 we should use another site ID.

Let's configure OSPF and add some Multi-Site configuration:


router ospf 1
  router-id 77.77.77.77

! Link to spine-1
interface Ethernet1/1
  no switchport
  ip address 10.11.11.2/30
  ip router ospf 1 area 0.0.0.0
  no shutdown
  evpn multisite fabric-tracking

! Link to spine-2
interface Ethernet1/2
  no switchport
  ip address 10.10.10.2/30
  ip router ospf 1 area 0.0.0.0
  no shutdown
  evpn multisite fabric-tracking
        

evpn multisite fabric-tracking detects whether one or all of the site-internal interfaces are available. As long as one of these interfaces is operational and available, the BGW can extend Layer 2 and Layer 3 traffic to remote sites.

! Link to the Route Server
interface Ethernet1/3
  no switchport
  ip address 50.0.0.1/30
  no shutdown
  evpn multisite dci-tracking        

The DCI-tracking function in the EVPN Multi-Site architecture detects whether one or all of the site-external interfaces are up and operational. As long as at least one of these interfaces remains up, the site-external connectivity is considered working, and the BGW can extend Layer 2 and Layer 3 services to remote sites.
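
Both tracking functions can be inspected on the BGW. The two commands below are a sketch of where to look; the output is not reproduced here.

```
! State of the site-internal interfaces configured with fabric-tracking
show nve multisite fabric-links
! State of the site-external interfaces configured with dci-tracking
show nve multisite dci-links
```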

Since the VTEPs on the Site-1 leaves don't form VXLAN tunnels directly to Site-2 but rather to the BGW, we need to set up an NVE interface on the BGW. The configuration is somewhat similar to the NVE interface we've configured on Leaf-1.

vlan 10
  vn-segment 10
vlan 20
  vn-segment 20
vlan 30
  vn-segment 30
vlan 100
  vn-segment 100

vrf context KDV
  vni 100
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn

interface Vlan100
  no shutdown
  vrf member KDV
  ip forward

interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback0
  ! VIP shared by one or more BGWs of the site
  multisite border-gateway interface loopback100
  member vni 10
    multisite ingress-replication
    ingress-replication protocol bgp
  member vni 20
    multisite ingress-replication
    ingress-replication protocol bgp
  member vni 100 associate-vrf        

We need at least two loopbacks to get the NVE working. Loopback0 acts as the source interface of the NVE, and loopback100 holds the VIP that is shared when there is more than one BGW within a site. Essentially, the VIP address is the next hop for EVPN routes, so unicast traffic between sites is tunneled between the VIPs, whereas loopback0 is used for BUM traffic.

interface loopback0
  ip address 77.77.77.77/32
  ip router ospf 1 area 0.0.0.0

interface loopback100
  ip address 100.0.0.1/32
  ip router ospf 1 area 0.0.0.0        
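
To confirm that the NVE picked up both loopbacks, something like the following could be used on BGW-1. This is a sketch; I'm not reproducing the output here.

```
! NVE details, including the source interface (loopback0) and the Multi-Site BGW interface (loopback100)
show nve interface nve1 detail
! VNIs 10, 20 and 100 and their state
show nve vni
```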

Last but not least is the BGP configuration:

router bgp 65000
  address-family ipv4 unicast
    redistribute direct route-map PERMIT
  address-family l2vpn evpn
    retain route-target all
  ! L2VPN EVPN peering with Spine-1
  neighbor 4.4.4.4
    remote-as 65000
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
  ! L2VPN EVPN peering with Spine-2
  neighbor 5.5.5.5
    remote-as 65000
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
  ! IPv4 peering with RS
  neighbor 50.0.0.2
    remote-as 63000
    address-family ipv4 unicast
  ! L2VPN EVPN peering with RS
  neighbor 99.99.99.99
    remote-as 63000
    update-source loopback0
    ebgp-multihop 5
    peer-type fabric-external
    address-family l2vpn evpn
      send-community
      send-community extended
      rewrite-evpn-rt-asn

route-map PERMIT permit 10        

Let's do some explaining here.

rewrite-evpn-rt-asn changes the ASN portion of the route target in incoming BGP updates: when it matches the ASN of the sending peer, it is replaced with the local ASN. Without this feature, our leaves wouldn't import any routes, since they expect the route target to be 65000:10 for VNI 10 and 65000:100 for VNI 100 (the L3 VNI), but Site-2 is in ASN 64000 and exports 64000:10 and 64000:100. With rewrite-evpn-rt-asn, we're able to resolve this issue.

peer-type fabric-external is used to define the site-external BGP peering session. This is how we know that our peer is external to our fabric.

Our BGWs and the RS must somehow figure out how to reach each other's loopbacks. Redistributing the directly connected prefixes through route-map PERMIT advertises the loopbacks into the IPv4 session with the RS. I could've used any IGP instead of IPv4 BGP.
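
If you'd rather advertise only the two loopbacks instead of every connected prefix, a variant could look like the sketch below. The prefix-list name LOOPBACKS is made up for this example, and this isn't part of the lab configuration.

```
! Hypothetical prefix-list: only loopback0 and the VIP (loopback100) of BGW-1
ip prefix-list LOOPBACKS seq 10 permit 77.77.77.77/32
ip prefix-list LOOPBACKS seq 20 permit 100.0.0.1/32

! Same route-map name as in the lab, now with a match clause
route-map PERMIT permit 10
  match ip address prefix-list LOOPBACKS
```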

The final step is to configure the Route Server.

Let's enable the necessary features:

nv overlay evpn
feature bgp
feature nv overlay        

We want to keep the next hops unmodified for L2VPN EVPN updates, which isn't the default behavior since we're using eBGP (RS is in AS 63000):

route-map NEXTHOP permit 10
  set ip next-hop unchanged
route-map PERMIT permit 10

interface loopback0
  ip address 99.99.99.99/32

router bgp 63000
  address-family ipv4 unicast
    redistribute direct route-map PERMIT
  address-family l2vpn evpn
    nexthop route-map NEXTHOP
    retain route-target all
  ! IPv4 peering with BGW-1
  neighbor 50.0.0.1
    remote-as 65000
    address-family ipv4 unicast
  ! L2VPN EVPN peering with BGW-1
  neighbor 77.77.77.77
    remote-as 65000
    update-source loopback0
    ebgp-multihop 5
    address-family l2vpn evpn
      send-community
      send-community extended
      route-map NEXTHOP out
      rewrite-evpn-rt-asn
  ! L2VPN EVPN peering with BGW-2
  neighbor 88.88.88.88
    remote-as 64000
    update-source loopback0
    ebgp-multihop 5
    address-family l2vpn evpn
      send-community
      send-community extended
      route-map NEXTHOP out
      rewrite-evpn-rt-asn
  ! IPv4 peering with BGW-2
  neighbor 150.0.0.1
    remote-as 64000
    address-family ipv4 unicast

This is it. A similar configuration is expected for Site 2.
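
As a rough sketch, the Multi-Site specific parts of BGW-2 could look like this. The ASN, loopback0, VIP, and DCI address come from this lab. The site ID 2, the OSPF process number, the DCI interface number, the /30 mask, and the RS-side address 150.0.0.2 are my assumptions. The spine peerings and the VLAN/NVE configuration are left out since they follow BGW-1.

```
! Site ID must differ from Site-1 (2 is an assumed value)
evpn multisite border-gateway 2

interface loopback0
  ip address 88.88.88.88/32
  ip router ospf 1 area 0.0.0.0

! VIP of Site-2
interface loopback100
  ip address 100.0.0.2/32
  ip router ospf 1 area 0.0.0.0

! DCI link to the Route Server (interface number and mask are assumed)
interface Ethernet1/3
  no switchport
  ip address 150.0.0.1/30
  no shutdown
  evpn multisite dci-tracking

route-map PERMIT permit 10

router bgp 64000
  address-family ipv4 unicast
    redistribute direct route-map PERMIT
  address-family l2vpn evpn
    retain route-target all
  ! IPv4 peering with RS (150.0.0.2 is an assumed address)
  neighbor 150.0.0.2
    remote-as 63000
    address-family ipv4 unicast
  ! L2VPN EVPN peering with RS
  neighbor 99.99.99.99
    remote-as 63000
    update-source loopback0
    ebgp-multihop 5
    peer-type fabric-external
    address-family l2vpn evpn
      send-community
      send-community extended
      rewrite-evpn-rt-asn
```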

Verification

Let's check from Leaf-1's perspective whether the BGP sessions are established.

The peerings with the spines are established and we receive EVPN routes.

Next, let's look at the MAC address table.

Whatever is learned from Site-2 comes from the NVE peer 100.0.0.1, and we also see that our NVE neighbors are up.

Leaf-3 and Leaf-4 use 9.9.9.9 as the anycast IP of their loopback. They belong to Site-1.
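
The commands for this check would be along these lines (a sketch; output omitted):

```
! Sessions to 4.4.4.4 and 5.5.5.5 should be established with a non-zero prefix count
show bgp l2vpn evpn summary
! EVPN routes for L2 VNI 10
show bgp l2vpn evpn vni-id 10
```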
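
For reference, these are the kind of commands that show the information above on Leaf-1 (a sketch; output omitted):

```
! MAC addresses in VLAN 10; remote entries point to an NVE peer
show mac address-table vlan 10
! NVE peers and their state
show nve peers
! MAC routes known to the L2 route manager, including those learned through BGP EVPN
show l2route evpn mac all
```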

Let's test connectivity.

Host-1 belongs to Site-1, Leaf-1, VLAN 10, vn-segment 10, with the IP address of 192.168.10.11/24.

Host-7 belongs to Site-2, Leaf-5, VLAN 10, vn-segment 10 with the IP address of 192.168.10.14/24

The ping is working. Here are the details of the frames captured on the DCI link from Site-1 to Site-2.

100.0.0.2 is the IP address of loopback100 on BGW-2.
100.0.0.1 is the IP address of loopback100 on BGW-1.

As we can see, bridging is working fine.

Let's try the L3 connectivity.

Host-1 belongs to Site-1, Leaf-1, VLAN 10, vn-segment 10, with the IP address of 192.168.10.11/24

Host-8 belongs to Site-2, Leaf-1, VLAN 20, vn-segment 20, with IP address of 192.168.20.14/24

VNI 100 is the one assigned to IP-VRF for routing purposes
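
To look at the routed path from Leaf-1, commands like these could help. This is a sketch that uses the VRF name VXLAN from the Leaf-1 configuration; output is omitted.

```
! Routes in the tenant VRF, including host routes learned through EVPN
show ip route vrf VXLAN
! MAC-IP bindings known for the L2 VNIs
show l2route evpn mac-ip all
```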

Let's take a look at how EVPN Route Type 2 is advertised across the multisite fabric.

For bridging (L2VNI) we're getting the following Update message:

The next hop is the VIP of BGW-1.
For bridging (L2 VNI), only one route target is advertised, whereas for routing (L3 VNI), an additional RT of the IP-VRF is advertised.

Here is how L3VNI routes are advertised to another site.

Two route targets are necessary for inter-VLAN routing:
RT 64000:100 belongs to the L3 VNI (IP-VRF) and RT 64000:20 belongs to the L2 VNI (MAC-VRF).
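
The same information can be read from the BGP table instead of a packet capture. A sketch, using Host-8's address from above:

```
! All MAC/IP advertisement (Type 2) routes in the BGP table
show bgp l2vpn evpn route-type 2
! Details for a single host, including the next hop and the extended communities (route targets)
show bgp l2vpn evpn 192.168.20.14
```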


That's all for today.

Thank you for reading and I hope you've enjoyed this article.



