L2 is BAD :(
rara avis in terris nigroque simillima cygno


I once had the opportunity to design the network architecture of a new data center.

The main idea I wanted to convey was to minimize the number of L2 domains and their size. To me, this approach has been obvious for the last few years, and I was genuinely surprised that it is not obvious to everyone.

So here we go…

First of all, L2 is a single failure domain: a problem at one point can easily spread to the entire datacenter. The most common problems are broadcast storms and malformed broadcast frames. A representative example of what this can lead to is the 2018 incident on the network of the large American provider CenturyLink, when a problem in a single L2 domain took down the 911 service in several states. More details can be found in the report on that incident.

A single L2 domain makes scaling difficult: the larger the datacenter, the more broadcast traffic there is on the network.

Yes, there are storm control mechanisms, but they drop everything indiscriminately. They cannot tell legitimate traffic from the storm, so ARP requests stop working, and so on.
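
To illustrate why storm control cannot separate good broadcasts from bad ones, here is a minimal sketch of a per-interval broadcast limiter. The class name, the budget of 100 frames per interval and the frame labels are all hypothetical; real switches do this in hardware with vendor-specific thresholds.

```python
from collections import Counter


class StormControl:
    """Toy limiter: the first `limit` broadcast frames in an interval pass, the rest drop."""

    def __init__(self, limit_per_interval):
        self.limit = limit_per_interval
        self.seen = 0

    def new_interval(self):
        # Called by a timer on a real device; resets the budget.
        self.seen = 0

    def allow(self, frame_kind):
        # frame_kind is deliberately ignored: the limiter only counts frames,
        # it has no idea whether a frame is a legitimate ARP request or storm noise.
        self.seen += 1
        return self.seen <= self.limit


def simulate(frames, limit):
    """Run one interval and count passed/dropped frames per kind."""
    sc = StormControl(limit)
    passed, dropped = Counter(), Counter()
    for kind in frames:
        (passed if sc.allow(kind) else dropped)[kind] += 1
    return passed, dropped


# One interval: 990 storm frames with 10 legitimate ARP requests mixed in (assumed numbers).
frames = ["arp" if i % 100 == 50 else "storm" for i in range(1000)]
passed, dropped = simulate(frames, limit=100)
print(f"ARP passed: {passed['arp']}, ARP dropped: {dropped['arp']}")
```

With these assumed numbers, nine of the ten ARP requests are dropped together with the storm, which is exactly the "ARP stops working" effect.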

Other than that, "problems" on L2 are hard to troubleshoot, and once something has happened it's usually too late.

Another disadvantage is the lack of a spoofing protection mechanism. A MAC address can be "accidentally" assigned when it is already in use somewhere else, and we end up with MAC address flapping. As a consequence, modern datacenter switches disable MAC learning in that particular VLAN.
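
A rough sketch of what MAC flapping looks like from the switch's point of view. This is a simplified model, not any vendor's implementation: the `MacTable` class, the port names and the threshold of three moves are assumptions, and real devices count moves within a time window.

```python
from collections import defaultdict


class MacTable:
    """Toy MAC learning table that counts how often an address moves between ports."""

    def __init__(self, flap_threshold=3):
        self.table = {}                # (vlan, mac) -> port where it was last seen
        self.moves = defaultdict(int)  # (vlan, mac) -> number of port changes
        self.flap_threshold = flap_threshold

    def learn(self, vlan, mac, port):
        """Learn a source MAC; return True once the entry is considered flapping."""
        key = (vlan, mac)
        old_port = self.table.get(key)
        if old_port is not None and old_port != port:
            self.moves[key] += 1
        self.table[key] = port
        return self.moves[key] >= self.flap_threshold


# Two hosts in VLAN 10 with the same MAC, sending frames alternately.
table = MacTable()
dup_mac = "02:00:00:00:00:01"  # locally administered address, made up for the example
results = [table.learn(10, dup_mac, port) for port in ("eth1", "eth2", "eth1", "eth2")]
print(results)
```

Nothing in the table can say which of the two hosts is the "real" owner of the address; the switch only sees the entry bouncing between ports.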

Is using EVPN\VXLAN the solution?

EVPN certainly reduces the amount of broadcast traffic and has some sort of loop protection mechanism, but nevertheless L2 remains L2. In addition:

  • EVPN will not protect against malformed broadcast packets;
  • the implementation of VXLAN/EVPN in network operating system code is much more complex than that of simple routing; the code base is larger, and so is the potential number of bugs;
  • when dealing with "new" vendors of network hardware, it is not clear in advance what operational difficulties we will run into;
  • although EVPN is an open standard, all the leading companies avoid multi-vendor interop and stay within a single vendor: it is not clear who to blame when the problem is "at the junction". The result is lock-in to a single vendor within a single site. I'd love to hear stories about an inter-vendor control plane in an EVPN\VXLAN fabric ;)
  • VXLAN is still "insecure": segmenting what is inside is impossible at the switch level, because we "can't see" what is in the tunnel. We need to strip the VXLAN headers "somewhere" and filter the traffic (if necessary);
  • Ethernet wrapped in UDP, wrapped in IP, wrapped in another Ethernet is complicated to troubleshoot. CRC errors in the original frame "fly" through the fabric, are not dropped anywhere and reach the destination in their original form (if cut-through switches are used in the fabric, of course).
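
To make the "Ethernet in UDP in IP in Ethernet" point concrete, here is a small sketch that builds and parses the 8-byte VXLAN header and adds up the encapsulation overhead. The header layout and UDP port 4789 follow RFC 7348; the function names are mine, and the overhead assumes an untagged outer Ethernet header and an IPv4 underlay without options.

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned destination port for VXLAN

# Header sizes in bytes (assumptions: no 802.1Q tag on the outer frame, IPv4 without options).
OUTER_ETH, OUTER_IPV4, OUTER_UDP, VXLAN_HDR = 14, 20, 8, 8


def vxlan_header(vni):
    """Build the VXLAN header: flags byte 0x08 (I bit), 24 reserved bits, 24-bit VNI, 8 reserved bits."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", 0x08 << 24, vni << 8)


def parse_vni(header):
    """Extract the VNI from the first 8 bytes of a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & 0x08:
        raise ValueError("I flag not set, VNI is not valid")
    return vni_word >> 8


def encap_overhead():
    """Bytes added in front of the original Ethernet frame."""
    return OUTER_ETH + OUTER_IPV4 + OUTER_UDP + VXLAN_HDR


hdr = vxlan_header(10010)  # 10010 is an arbitrary example VNI
print(hdr.hex(), parse_vni(hdr), encap_overhead())
```

Anything that wants to filter on the inner frame has to walk past all four of these headers first, which is why the filtering ends up "somewhere" other than the transit switches.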

What do I propose instead?

Use pure L3 routing. No overlay in the fabric. All overlays should live inside the servers, in SDNs.

By only needing routing, we reduce the "complexity" of the network - thus reducing the number of potential problems. Fewer features means fewer places where problems can occur.

It is much easier to get devices from different vendors to interoperate using only routing than using EVPN, so we are not tied to a single vendor.

The ideal scheme is generally L3 all the way to the server where the services live. This scheme covers fault tolerance, bandwidth expansion and scalability: the service (e.g. a web site) is bound to a dummy\loopback interface, and its IP address is announced to the physical network by a dynamic routing protocol (BGP being the de facto standard), over each link for example, and the announcements propagate upstream

????

PROFIT!
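
The "service on a loopback, address announced over BGP" scheme above can be sketched as a small health-check helper. This is only one possible approach, not necessarily what the author runs: it assumes a BGP speaker on the server that reads text commands from a helper process (the command strings follow the ExaBGP-style `announce route … next-hop self` syntax, so check them against your version), and the prefix, function names and health check are hypothetical.

```python
import socket

# Hypothetical service address on the loopback interface (documentation range).
SERVICE_PREFIX = "192.0.2.10/32"


def service_is_healthy(host, port, timeout=1.0):
    """Very simple health check: can we open a TCP connection to the service?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def bgp_command(prefix, healthy):
    """Text command for the BGP speaker: announce when healthy, withdraw when not."""
    verb = "announce" if healthy else "withdraw"
    return f"{verb} route {prefix} next-hop self"


def next_command(prefix, healthy, last_state):
    """Return a command only when the health state changed, otherwise None."""
    if healthy == last_state:
        return None
    return bgp_command(prefix, healthy)


# State transitions for one service: it comes up, stays up, then fails.
commands = [
    next_command(SERVICE_PREFIX, True, None),
    next_command(SERVICE_PREFIX, True, True),
    next_command(SERVICE_PREFIX, False, True),
]
print(commands)
```

In a real deployment a loop would call `service_is_healthy` every few seconds and print each non-empty command to stdout for the BGP speaker to pick up. When the service dies, the /32 is withdrawn and traffic shifts to the other servers announcing the same address.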

Mikhail Sokolov

Middle network engineer

1 year ago

So how did it turn out in the end? Did you manage to build the DC on pure L3?


More articles by Roman Pomazanov

  • FRRouting. No clickbaits.
  • The Birth of Bad Architecture
  • Let's IaC some network labs
  • All MTUs matter
  • Is it easy to replace a pair of network switches? (18+)
  • Solving some Cisco ISE issues