Too Much Redundancy

Peter Welcher

Published: March 11, 2022

[A light topic for a Friday posting ...]

Networks evolve over time.

We’re all familiar with technical debt. One form of it is that old WAN links don’t go away overnight, although if they cost a lot, removal should be expedited. But sometimes that’s hard to do. With datacenters, the problem is often old equipment that just can’t be replaced or removed for (reasons). So you end up with an old server or two as the only thing holding up phasing out the ancient datacenter. In some cases, for years.

This is a common problem when migrating to a new datacenter. Old facilities or costly mid-city real estate are two reasons for datacenter migrations. Improving reliability by moving to a CoLo with better power, cooling, security, etc. is another reason. And getting your racks out of a fugly large closet with ducts and pipes etc. is a darn good reason.

In the meantime, however, you can end up with unnecessary WAN links. Site connections tend to be homed to the datacenter. With two datacenters, network folks often home sites to both. With migration to a new datacenter interruptus (change of plans), you end up with even more connections, often of three vintages (to the oldest datacenter, to the new datacenter, and to the second new datacenter).

Here’s an adapted real-world diagram involving datacenters.

Site A is the original site. When Site B was added, they connected it as shown in black. The top L3 switches may have been added later as WAN routers.

When Site C was added, the black connections were added as shown.

For (reasons), Site D was added more recently, and the eventual phase-out of Sites A and C is planned, subject to change. I put the names of B and D in red to emphasize that they are the “new core”.

There’s actually more stuff going on, like gradually phasing in new core switches at B and D, but the above seems adequate for the points I wish to make.

Challenge: Predict the routing under various failure scenarios.
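One way to work through this challenge is to model the sites as a small graph and compute the expected path with and without a given link. The sketch below is an illustration only: the link list and costs in `LINKS` are hypothetical stand-ins (the diagram's actual links and metrics are not reproduced here), and `shortest_path` is a plain Dijkstra search, not a model of any particular routing protocol.

```python
import heapq


def link(a, b):
    """Order-independent key for an undirected link."""
    return frozenset((a, b))


def shortest_path(links, src, dst, failed=frozenset()):
    """Return (cost, path) for the cheapest path, or None if dst is unreachable.

    links  -- iterable of (site_a, site_b, cost) tuples, treated as undirected
    failed -- set of link(a, b) keys to treat as down
    """
    graph = {}
    for a, b, cost in links:
        if link(a, b) in failed:
            continue
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None


# Hypothetical topology: A and C dual-homed to hubs B and D, plus a hub-to-hub link.
# Costs are made up for illustration.
LINKS = [
    ("A", "B", 10),
    ("A", "D", 10),
    ("C", "B", 10),
    ("C", "D", 10),
    ("B", "D", 5),
]

if __name__ == "__main__":
    scenarios = {
        "no failure": frozenset(),
        "A-B down": {link("A", "B")},
        "A-B and A-D down": {link("A", "B"), link("A", "D")},
    }
    for name, failed in scenarios.items():
        print(name, "->", shortest_path(LINKS, "A", "C", failed))
```

Note that this returns a single path even when two paths tie on cost. A real network would typically load-share across equal-cost paths, which is one more thing to account for when predicting behavior.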

If you look closely, there are some links that might reasonably be phased out. Which would you remove?

Perhaps some red coloring will help.

See them now?

Why those? Well, my thought process is that Sites A and C should be dual-connected to B and D, the new cores.

Also, eliminating the red links makes traffic flows more predictable and troubleshooting easier. Yes, that is probably not clear from the above diagram.

Re-drawing the diagram makes that clearer:

There’s still a lot of redundancy, but the structure is clearer. Sites 1 and 3 have dual links to each of the other two. And the two hub sites also have direct links to each other.

I’ve seen something like this at another organization. Two sites in California, one in the mid-West, and one on the East Coast. The “main” two were subject to discussion. It turned out there was one clear main site in California and the other nearby site might be folded into it. The other “main” site was somewhat of a toss-up, but it turned out vacating the mid-West site was on the radar, for substantial cost savings.

The remaining question now is whether to treat the East Coast site as a backup datacenter, or to shift to a cloud-centric approach. Geography / latency was a consideration.

In yet another organization, the core is 6 routers, 2 at each of three sites, one of which is a CoLo. A fair number of other sites dual-connect to the core via diverse providers. That’s workable. Moving servers/apps etc. is a major consideration that likely will prevent reducing the core to 4 devices at 2 sites.

The Design Principle

That section title is perhaps a bit overly grandiose, but …

One common strategy is to pick two “hub” sites, and connect other sites to those two. Eliminate other connections unless you have major traffic flows or other reasons that justify the costs.
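To put rough numbers on the cost argument, the sketch below compares link counts for a full mesh against a dual-hub layout. It assumes one link per site pair, and it assumes the two hubs are also linked directly to each other (as in the re-drawn diagram earlier). The function names are mine, purely for illustration.

```python
def full_mesh_links(sites: int) -> int:
    """Links needed to connect every site directly to every other site."""
    return sites * (sites - 1) // 2


def dual_hub_links(sites: int) -> int:
    """Links needed when two sites act as hubs.

    Each non-hub site gets one link to each hub, plus one hub-to-hub link.
    """
    if sites < 2:
        raise ValueError("need at least the two hub sites")
    spokes = sites - 2
    return 2 * spokes + 1


if __name__ == "__main__":
    for n in (4, 6, 10, 20):
        print(f"{n} sites: full mesh {full_mesh_links(n)}, dual hub {dual_hub_links(n)}")
```

With 4 sites the difference is only 6 links versus 5, but the mesh grows quadratically while the dual-hub count grows linearly: at 10 sites it is 45 versus 17.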

If your network is geographically widespread, then perhaps do that per continent, i.e. two hubs in the U.S., etc. Or more, depending on the number of devices in various regions.

This aligns with CoLo-centric or Cloud-centric networking, as I’ve discussed elsewhere. You can use VPN to connect sites to, say, two cloud hubs (or more for larger geographic presence). Alternatively, connect at least major sites to nearby CoLo’s for agile NaaS, etc. and cloud connectivity.

Recycling Design Ideas

You may be thinking “WAN, that’s so old-fashioned”. Well, yes. However, I’ll note that the above design issues recurred in the context of dual CoLo facilities. And that remains an active design approach if you’re using dual (or more) CoLo’s as WAN and/or SD-WAN hubs, in part since CoLo to Cloud NaaS provides agility that may not be available for the last mile.

The even newer variant is using cloud provider locations as hub sites.

Think Hierarchically!

If you look at this a bit differently, it’s just hierarchical networking.

Have a core, connect “everything” dually to the core.

If your network is global, have a global core, with dual core members in each continent, say. And dual-connect (if possible) sites within a region (continent or whatever) to the continent’s core. That loosely describes a couple of WAN or SD-WAN topologies I’ve seen. Internationally, putting core switches into CoLo’s helps with availability and cost of the long-haul fiber connections.
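If you keep a link inventory, the “connect everything dually to the core” rule is easy to check mechanically. The following is a minimal sketch, assuming links are recorded as simple site pairs. The function `audit_hierarchy` and the sample data are hypothetical, not taken from any real network or tool.

```python
def audit_hierarchy(links, core):
    """Check a link list against a 'dual-connect everything to the core' rule.

    links -- iterable of (site_a, site_b) pairs, treated as undirected
    core  -- collection of core (hub) site names

    Returns (under_connected, lateral):
      under_connected -- {site: [core sites it reaches]} for non-core sites
                         linked to fewer than two distinct core sites
      lateral         -- links between two non-core sites, which bypass the hierarchy
    """
    core = set(core)
    core_uplinks = {}
    lateral = []
    for a, b in links:
        if a not in core and b not in core:
            lateral.append((a, b))
        for site, other in ((a, b), (b, a)):
            if site in core:
                continue
            uplinks = core_uplinks.setdefault(site, set())
            if other in core:
                uplinks.add(other)
    under_connected = {
        site: sorted(uplinks)
        for site, uplinks in core_uplinks.items()
        if len(uplinks) < 2
    }
    return under_connected, lateral


if __name__ == "__main__":
    # Hypothetical inventory: C has lost (or never had) its link to D,
    # and a legacy A-C link is still in place.
    sample_links = [("A", "B"), ("A", "D"), ("C", "B"), ("A", "C"), ("B", "D")]
    under, lateral = audit_hierarchy(sample_links, core={"B", "D"})
    print("under-connected:", under)
    print("lateral links:", lateral)
```

Lateral links are not necessarily wrong, since major traffic flows can justify them, but listing them makes each one a deliberate decision instead of a leftover.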

Conclusions

I’m not convinced the diagrams above told the story perfectly, but that’s the real world.

There are two main conclusions:

Impose hierarchy on network designs, avoiding too much redundancy.
If you can’t look at the diagram and describe the routing in a simple way, re-design it! Random WAN meshes mostly went away 20-30 years ago! With a simpler design, traceroute can still be useful, but you can also predict what the path should be!

Comments

Comments are welcome, whether in agreement or constructive disagreement with the above. I enjoy hearing from readers and carrying on deeper discussion via comments. Thanks in advance!

Hashtags: #NetCraftsmen #CiscoChampion

Disclosure statement

Twitter: @pjwelcher

LinkedIn: Peter Welcher

Joël François

Senior IP Engineer | CCIEx2 (RS,SP) #55635 | CCDE in preparation

3 years ago

Eventually also consider transport type in the design (e.g. dark fiber / shared risk link group with xWDM, etc…)

Palash Barua

SDN/IP/MPLS/Cloud-Native/Solution-Architect/Automation/Linux/DB (CCIE Enterprise # 60345)

3 years ago

Nice one sir
