NETWORK RESILIENCE: PROTECTING NG911 – OUR DIGITAL LIFELINE
Copyright 2024 Mark J. Fletcher - All Rights Reserved

AN AUDIO VERSION OF THIS BLOG IS AVAILABLE HERE:

https://fletch.blog/2024/04/21/network-resilience-protecting-ng911-our-digital-lifeline/

Reviewing recent network outages that prevented callers from reaching 911 services across widespread geographies, I found that they were NOT REALLY 911 FAILURES. They appear to have been failures in ACCESS to 911 services.

With that in mind, let’s examine various operational policies and best practices for network diversity from a perspective that balances technical savviness with real-world pragmatism. Then, we can factor in the added consequences of dealing with critical Life Safety Emergency Communications infrastructure like the ESInet, which is akin to Public Safety’s own private broadband internet. Nearly every black-hat hacker would consider the ESInet the ‘Holy Grail’ of targets. What does this leave you with? A massive target for terrorism that is complex yet potentially delicate. The most important question is, “How do we best protect this critical resource?”

The answer to that question is RESILIENCY and REDUNDANCY.

Entry/Exit Redundancy for Network Facilities

Imagine the NG911 network is a large, multi-floor building with different companies providing various services on different floors. You are a consultant tasked with creating an emergency exit strategy for the building. You wouldn’t design only one exit path; you’d plan several to account for various scenarios, as shown in Figure A:

Network dual entrances work on the same principle. They provide multiple physical paths for connectivity into a building, ensuring that if one is compromised, say by construction damage, there’s an alternate pathway for circuits to enter the facility and keep the data flowing, as shown in Figure B, above.

Leveraging Multiple Carriers and ISPs

Employing alternate carriers is akin to having multiple energy suppliers for a smart grid. This redundancy ensures that if one ISP experiences an outage, another can pick up the slack without interrupting service. It’s about not putting all your eggs in one basket and ensuring business continuity. Utilizing multiple carriers, each using multiple paths, can provide an exceptional level of diversity against a physical infrastructure disruption, as seen in Figure C. Eliminating single points of failure in the infrastructure allows a desired level of service to be provided regardless of what fails and where. The overall capacity of that service becomes something that can be engineered based on need and is dependent on the business requirements. The design depicted in Figure C offers dual entrances to the facility in conjunction with dual carriers on diverse paths. With this design, a complete failure at any single point in the network still leaves a diverse, surviving path.
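To make the idea of "no single point of failure" concrete, here is a minimal sketch that fails each component in turn and checks whether at least one path survives. The component and path names are invented for illustration and are not taken from Figure C; a real inventory would come from your carriers' circuit layout records.

```python
from typing import Dict, List

# Hypothetical inventory: each path lists the physical components it traverses.
# Every name here is an assumption made up for this example.
PATHS: Dict[str, List[str]] = {
    "carrier_a_primary": ["entrance_north", "conduit_1", "carrier_a_office_1"],
    "carrier_a_alternate": ["entrance_south", "conduit_2", "carrier_a_office_2"],
    "carrier_b_primary": ["entrance_north", "conduit_3", "carrier_b_office_1"],
    "carrier_b_alternate": ["entrance_south", "conduit_4", "carrier_b_office_2"],
}


def surviving_paths(paths: Dict[str, List[str]], failed: str) -> List[str]:
    """Return the names of the paths that do not traverse the failed component."""
    return [name for name, components in paths.items() if failed not in components]


def single_points_of_failure(paths: Dict[str, List[str]]) -> List[str]:
    """Return every component whose failure alone leaves no surviving path."""
    all_components = set().union(*paths.values())
    return sorted(c for c in all_components if not surviving_paths(paths, c))


if __name__ == "__main__":
    print("Single points of failure:", single_points_of_failure(PATHS))
    print("Survives loss of north entrance:", surviving_paths(PATHS, "entrance_north"))
```

An empty result means that losing any one listed component still leaves a working path. It says nothing about two failures at once, or about components that were never entered into the inventory.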

Physical Path Separation

As noted, the strategic separation and isolation of circuit paths (avoiding shared conduits and telephone poles), along with provider diversity, can be an effective risk mitigation tactic. This is similar to any urban utility planning exercise where critical infrastructure is spread out to prevent a single failure from cascading into a city-wide blackout.

Digging Deeper: Diversity Within the Carrier’s Realm

Additional levels of diversity can be achieved by extending these principles into the carrier’s infrastructure, which may be available for the asking. If your services are life safety critical, you may be able to establish a contract that guarantees diversity of the essential components associated with your facilities. That could include service from different data centers, much like a distributed cloud environment, which allows for failover and load balancing. This can be a significant advantage because an issue within one segment won’t hinder the overall network; a backup location is always there and ready to take over.

Component Level Distribution Strategy

Depending on the criticality of your data, at the micro-level, it may be worth distributing circuits over physically unique network switches, ideally across various rooms and even floors within a carrier facility. Although this requires monitoring and management, it allows you to achieve any level of fault tolerance you deem necessary.

Balancing Cost and Redundancy

Implementing such a layered approach to network architecture requires significant investment, not just in terms of capital expenditure but also in the ongoing operational costs. It’s a complex balancing act akin to deciding whether to buy insurance for your smartphone. You weigh the upfront and potential costs against the value of the protected asset. With public safety networks, the calculation becomes more complex because one of the potential impacts is the loss of life, and no dollar amount can be compared to a life.

Checking Your Work with Regular Audits

Ensuring compliance with diversity guidelines is by no means a one-off task. It’s an ongoing process, like the regular updates and patches required to keep software secure. Without vigilant auditing, a single network change could undermine the diversity strategy, introducing vulnerability where there was none. However, it is essential to remember that rules and audits only report on compliance. Without a pre-negotiated penalty for violations, there is no incentive for any participant to maintain compliance or quickly rectify an issue if it’s brought to their attention.

Economic Justifications for Network Diversity

The extent of network diversity implemented is often proportional to the criticality of the services it supports. For a data center hosting mission-critical applications, the cost of any downtime may be so exorbitant that the investment in network diversity is a no-brainer. However, a less intensive diversity plan might be more economically rational for a less critical application that can get by with a cloud-based backup alternative.
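For a business application, that trade-off can be roughed out with simple arithmetic. The sketch below is only an illustration: the function names and every figure in it are assumptions, and it deliberately covers financial loss only, since the life safety impact discussed earlier cannot be priced this way.

```python
def expected_annual_downtime_cost(outages_per_year: float,
                                  hours_per_outage: float,
                                  cost_per_hour: float) -> float:
    """Expected yearly cost of downtime: frequency x duration x hourly cost."""
    return outages_per_year * hours_per_outage * cost_per_hour


def diversity_pays_off(cost_without: float,
                       cost_with: float,
                       annual_diversity_cost: float) -> bool:
    """True when the downtime cost avoided exceeds what the diversity costs per year."""
    return (cost_without - cost_with) > annual_diversity_cost


if __name__ == "__main__":
    # All figures below are hypothetical placeholders, not measured data.
    without = expected_annual_downtime_cost(2, 4, 50_000)     # single path
    with_div = expected_annual_downtime_cost(2, 0.5, 50_000)  # fast failover shortens each outage
    print("Downtime cost without diversity:", without)
    print("Downtime cost with diversity:", with_div)
    print("Worth it at $120,000 per year?", diversity_pays_off(without, with_div, 120_000))
```

Swapping in your own outage history and hourly cost is the whole exercise. For a less critical application the same comparison may well come out the other way.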

The Reliability Analysis – Beyond the Scope?

Finally, the reliability factor of network diversity, which assesses the statistical likelihood of system failures, requires a deep dive into potentially complex mathematical models and theories. This advanced analysis helps architect a diverse network optimized for the highest possible uptime within an acceptable cost framework.
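The full analysis is beyond the scope here, but the most basic building block is easy to sketch. Assuming failures are independent (a big assumption, and exactly what physical diversity is meant to make true), components in series multiply their availabilities, while redundant paths in parallel fail only when every path fails. The function names and the 99% figures below are illustrative assumptions, not measurements from any real network.

```python
from math import prod
from typing import Iterable

HOURS_PER_YEAR = 8760  # 365 days x 24 hours


def series_availability(availabilities: Iterable[float]) -> float:
    """Availability of a chain where every component must be up."""
    return prod(availabilities)


def parallel_availability(availabilities: Iterable[float]) -> float:
    """Availability of redundant paths where at least one must be up.

    Assumes the paths fail independently. Two circuits sharing a conduit
    violate that assumption, and the real figure will be worse.
    """
    return 1 - prod(1 - a for a in availabilities)


def downtime_hours_per_year(availability: float) -> float:
    """Convert an availability fraction into expected hours of downtime per year."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of one path
    dual = parallel_availability([single, single])
    print("One path, hours down per year:", round(downtime_hours_per_year(single), 2))
    print("Two diverse paths, hours down per year:", round(downtime_hours_per_year(dual), 2))
```

The gap between the two printed numbers is the theoretical payoff of a second, truly independent path. Any shared component eats into it.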

Understanding network diversity at this stage involves appreciating the nuances of network engineering and its critical role in business resilience, which becomes increasingly relevant as our dependency on digital infrastructure grows.

For now, a simple rule to follow is to physically trace the path of each data and communications circuit through your network. Where you run into multiple facilities utilizing a single component, determine what it would cost you to use multiple devices. But don’t stop there. You need to figure out how you’re going to share traffic across both and what that management model will look like.
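Once each circuit has been traced and written down, finding the shared components is a simple bookkeeping exercise. Here is a minimal sketch, assuming you have recorded each circuit as a list of the components it passes through. The circuit and component names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def shared_components(circuits: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each component used by more than one circuit to the circuits using it."""
    users = defaultdict(list)
    for circuit, components in circuits.items():
        for component in set(components):  # count a component once per circuit
            users[component].append(circuit)
    return {comp: sorted(names) for comp, names in users.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical trace results; every name is an invented example.
    traced = {
        "voice_trunk_1": ["demarc_a", "switch_3", "conduit_east"],
        "voice_trunk_2": ["demarc_b", "switch_3", "conduit_west"],
        "data_link_1": ["demarc_a", "switch_7", "conduit_west"],
    }
    for component, circuits in sorted(shared_components(traced).items()):
        print(component, "is shared by", ", ".join(circuits))
```

Each component the sketch reports is a candidate for the cost question above: what would it take to split those circuits across separate devices?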

In the end, you’ll possess the information you need to make an intelligent decision from a physical hardware perspective as well as a monetary commitment.

If you enjoyed this article, please follow me on X @Fletch911, and be sure to check out my profiles on LinkedIn and Facebook.


Mark J. Fletcher, ENP VP Public Safety Solutions – @911inform www.911inform.com | 833-333-1911
