L2 is BAD :(
I once had the opportunity to design the network architecture of a new data center.
The main idea I wanted to convey was to minimize the number of L2 domains and their size. For me, this approach has been obvious for the last few years, and I was genuinely surprised that it is not obvious to everyone.
So here we go…
First of all, "L2 is a single failure domain": a problem at one point can easily spread to the entire datacenter. The most common problems are broadcast storms and malformed broadcast frames. As a representative example of what this can lead to, recall the 2018 case on the network of the large American provider CenturyLink, when a problem in a single L2 domain suspended the 911 service in several states. More details can be found in the document
A single L2 domain makes scaling difficult: the larger the datacenter, the more broadcast traffic there is on the network.
Yes, there are storm control mechanisms, but they drop everything indiscriminately. They cannot tell legitimate traffic from the rest, so ARP requests stop working, etc.
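To make the storm control point concrete, here is a minimal sketch in Python. It is a deliberately simplified model, not how any particular switch implements the feature: it assumes a fixed per-interval frame budget, and the names `Frame`, `build_interval` and `storm_control` are hypothetical. The `legitimate` flag exists only inside the simulation; the limiter never looks at it, which is exactly the problem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    kind: str         # "arp" or "storm"
    legitimate: bool  # known to the simulation only, invisible to the limiter


def build_interval(total=1000, arp_every=100):
    """Build one interval of broadcast frames: mostly storm traffic,
    with a legitimate ARP request at every `arp_every`-th position."""
    frames = []
    for i in range(total):
        if i % arp_every == arp_every - 1:
            frames.append(Frame("arp", True))
        else:
            frames.append(Frame("storm", False))
    return frames


def storm_control(frames, limit):
    """Forward the first `limit` broadcast frames of the interval and drop
    the rest, without inspecting what the frames actually are."""
    return frames[:limit], frames[limit:]


def count_legitimate(frames):
    return sum(1 for frame in frames if frame.legitimate)


frames = build_interval()
forwarded, dropped = storm_control(frames, limit=100)
print("legitimate forwarded:", count_legitimate(forwarded))
print("legitimate dropped:", count_legitimate(dropped))
```

With these made-up numbers, ten ARP requests are spread across a thousand broadcast frames and the budget is one hundred frames, so only one ARP request gets through and nine are dropped together with the storm.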
Beyond that, "problems" on L2 are hard to troubleshoot, and by the time something has happened it is usually too late.
Another disadvantage is the lack of a spoofing protection mechanism: a MAC address can be "accidentally" assigned to a host while it is already in use somewhere else, and we get MAC address flapping. As a consequence, modern datacenter switches disable the MAC learning mechanism in that particular VLAN.
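The flapping scenario can be sketched as a toy MAC table in Python. This is an illustration under stated assumptions, not vendor behavior: real switches typically count moves within a time window and differ in how they react, while this sketch uses a plain move counter and a hypothetical `flap_threshold`. The class name `MacTable` and the return strings are invented for the example.

```python
from collections import defaultdict


class MacTable:
    """Toy MAC table for one VLAN with naive flap detection."""

    def __init__(self, flap_threshold=3):
        self.table = {}                # MAC address -> port
        self.moves = defaultdict(int)  # MAC address -> number of port changes
        self.flap_threshold = flap_threshold
        self.learning_enabled = True

    def learn(self, mac, port):
        if not self.learning_enabled:
            return "learning-disabled"
        previous = self.table.get(mac)
        if previous is None:
            self.table[mac] = port
            return "learned"
        if previous == port:
            return "unchanged"
        # Same MAC seen on a different port: a move, or a duplicate address.
        self.table[mac] = port
        self.moves[mac] += 1
        if self.moves[mac] >= self.flap_threshold:
            # Assumed reaction: stop learning in the whole VLAN.
            self.learning_enabled = False
            return "flapping"
        return "moved"


vlan = MacTable(flap_threshold=3)
events = [
    vlan.learn("aa:aa:aa:aa:aa:aa", "eth1"),
    vlan.learn("aa:aa:aa:aa:aa:aa", "eth2"),  # duplicate MAC on another port
    vlan.learn("aa:aa:aa:aa:aa:aa", "eth1"),
    vlan.learn("aa:aa:aa:aa:aa:aa", "eth2"),
    vlan.learn("bb:bb:bb:bb:bb:bb", "eth3"),  # unrelated host, same VLAN
]
print(events)
```

The last call is the important one: once learning is disabled, an unrelated and perfectly healthy host in the same VLAN is affected too.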
Is using EVPN/VXLAN the solution?
EVPN certainly reduces the amount of broadcast traffic and has a loop protection mechanism of sorts, but L2 nevertheless remains L2. In addition:
What's on offer?
Use pure L3 routing. No overlay in the fabric. All overlay should live inside the servers, in SDNs.
By only needing routing, we reduce the "complexity" of the network - thus reducing the number of potential problems. Fewer features means fewer places where problems can occur.
It is much easier to achieve interoperability between devices from different vendors using only routing than with EVPN, so we are not tied to a vendor.
The ideal scheme is generally L3 all the way to the server where the services are located. This scheme covers fault tolerance, bandwidth extensibility, and scalability: the service (e.g. a web site) sits on a dummy/loopback interface, and its IP address is announced to the physical network by a dynamic routing protocol (BGP is the de facto standard), for example over each link, and the announcements propagate upstream
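One way to picture the announcement side is a small health-check helper that tells a BGP speaker on the server when to announce or withdraw the service address. The sketch below is in Python and rests on several assumptions: the command strings follow the text API that ExaBGP reads from a helper process's standard output (check the exact syntax against the ExaBGP version in use), the prefix `192.0.2.10/32` is a documentation address standing in for the loopback IP, and the function names are hypothetical. How health is actually measured is left out entirely.

```python
SERVICE_PREFIX = "192.0.2.10/32"  # placeholder for the service loopback address


def route_command(prefix, healthy):
    """Build an announce or withdraw command for the service prefix."""
    action = "announce" if healthy else "withdraw"
    return f"{action} route {prefix} next-hop self"


def commands_for(health_samples, prefix=SERVICE_PREFIX):
    """Turn a sequence of health check results into BGP commands,
    emitting a command only when the health state changes."""
    commands = []
    previous = False  # nothing is announced before the first healthy sample
    for healthy in health_samples:
        if healthy != previous:
            commands.append(route_command(prefix, healthy))
            previous = healthy
    return commands


# Service comes up, stays up, fails once, then recovers.
for command in commands_for([True, True, False, True]):
    print(command)
```

When the service fails, the prefix is withdrawn and traffic stops arriving at that server; with the same address announced from several servers or over several links, the fabric simply keeps using the remaining paths.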
????
PROFIT!
Middle network engineer
1 year ago: So what happened in the end? Did you manage to build the DC on pure L3?