Data Centers Unveiled: How Society Hangs by a Thread
Dr. Eric Woodell
World's #1 expert in data center resilience. I audit and certify colocation facilities, ensuring secure, continuous operations—insured by Lloyd's of London.
In a very real way, data centers have now become the digital backbone of our modern economy. Every online transaction you make is fed through a data center, usually multiple data centers spread across the country. Your medical records, financial records, social media, telecom data from your cell phone, everything is fed through and stored in them. For large corporations, everything you do with a computer in an office building works the same way. If you watch video streaming services, whether from Amazon, Hulu, or YouTube, same thing. You get the idea. If you’re not living in a cave, your life is affected by data centers.
Correspondingly, information technology (IT) has become so pervasive, handling everything from shipping and receiving to internal email and customer billing, that IT systems are now the lifeblood of every large company. This includes petroleum and utilities companies, food delivery companies, big-box stores… Every large company in every business sector relies on IT, and thus depends on data centers.
Data centers are specially constructed buildings with dedicated power and cooling systems to protect information technology (IT) assets, along with fire-suppression and other life-safety systems and high levels of physical security. They typically include dual power feeds, battery back-up systems, diesel generators and dedicated cooling systems relying on local water supplies. They’re very expensive to build ($200-300 million is the low end), and they consume huge amounts of electrical power and water to function. I’ve often stated that data centers are “energy hogs.”
Because they are so expensive to build, operate, maintain and staff, and take years to build, many companies resort to “colocation” services. “Colo” companies build large data centers with wide-open floors that can be segregated into caged areas, which individual companies lease on an as-needed basis. The advantages for enterprise IT organizations of large companies (where data centers are not part of their core business) are obvious: no upfront construction costs, a pay-as-you-go approach, scalability, and speed to deployment measured in weeks, not years.
Colocation Market Growth
Today, of the ~5300 data centers across the United States, ~3000 are owned by colo companies such as CyrusOne, Cyxtera, Digital Realty, Equinix, Switch, QTS and Vantage. The colocation market in the USA is estimated at $71.27 billion in 2024, growing at a compound annual growth rate of 14.1%.
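To get a feel for what a 14.1% compound rate implies, here is a minimal Python sketch. It assumes the rate stays constant, which is a simplifying assumption rather than a forecast; the function name and the five-year horizon are illustrative choices of mine, and only the $71.27 billion base and the 14.1% rate come from the estimate above.

```python
# Sketch: compound growth from the 2024 estimate, assuming a constant rate.
BASE_2024_BILLION_USD = 71.27   # market estimate quoted above
CAGR = 0.141                    # compound annual growth rate quoted above


def project_market(base_billion: float, annual_rate: float, years: int) -> float:
    """Return the market size after `years` of constant compound growth."""
    return base_billion * (1 + annual_rate) ** years


# Illustrative five-year horizon (an assumption, not a figure from the source).
for offset in range(6):
    size = project_market(BASE_2024_BILLION_USD, CAGR, offset)
    print(f"{2024 + offset}: ${size:,.2f} billion")
```

At this rate the market would roughly double in about five years, which is the scale of growth the investor class is chasing.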
The Shark Tank billionaire investor Kevin O’Leary summarized this market in a recent Fox News interview:
“In development and real estate right now, the hottest asset class is very high-end data centers. They cost anywhere from $2.5 to $3.5 billion each, they are very expensive, they require low [actually high] power, you need permits. But most of the major institutions in the world need more data centers, and that’s why developers like me are doing this…”
The investor class moving into the data center market is now funding a construction boom, driven by enterprise IT organizations demanding either more IT capacity or additional redundancy. The construction boom is also heavily influenced by the increased demand for artificial intelligence (AI).
Impacts on the Electric Grid
The colo market has grown massively from its infancy in 2010, when power purchase agreement volumes from local utilities totaled just 0.1 gigawatts (GW). By 2021, they had reached 31.1 GW.
The 14% year-over-year demand growth, combined with the growth of electric vehicles, is occurring at the same time that the Biden Administration has been aggressively compelling electric power production companies to retire their fossil-fuel and nuclear power plants and replace them with renewable power facilities.
According to the North American Electric Reliability Corporation (NERC), this will create a gap of 83 gigawatts (GW) of power. In comparison, New York City consumes 5.5 GW, so the gap is roughly fifteen times the city’s demand. The resultant shortfalls represent risks of brownouts or blackouts, where utility distribution companies must shut off power to customers to keep the grid from total collapse:
This means that during the heaviest demand periods (the high heat of summer or the dead of winter) the red states are most likely to see their power shut off. It should be noted that electrical utility contracts with data centers typically stipulate that their power will not be interrupted unless absolutely necessary; residential customers have no such guarantees. The predictable result is that power will be cut to homes when they need it the most, causing people to die from excessive heat or to freeze to death.
The Colocation Market and its Hidden Dangers
As I pointed out at the beginning of this article, data centers are the backbone of our economy, and the majority of them are owned by colocation companies, who lease out spaces to enterprise IT organizations of large companies who prefer to avoid the expense of building their own. The customers of colocation companies are represented by the following business sectors:
Because a colocation data center is like a special warehouse where caged areas can be installed, you could have dozens or hundreds of companies represented in a single location. The failure of a single colocation data center, such as when the back-up power systems fail to respond to a utility power outage, could therefore affect all of the customers in that site; the ripple effects are impossible to predict.
This means that each colocation data center must be meticulously maintained, and any deficiencies discovered in the IT “safety nets” for power and cooling must be addressed immediately. Like commercial passenger jets, they are fabulously well designed, but require stringent maintenance practices to provide safe and reliable service over the long run.
Enterprise IT Organizations Lack Agency When They Shift to Colocation Data Centers
Unfortunately, most IT organizations from large companies have no means to verify that the safety nets for their IT equipment are being properly maintained; instead, they rely upon “Service Level Agreements” (SLAs) to provide a financial remedy in case of unplanned outages (for whatever reason) resulting in business interruptions. The industry standard for financial remedy is credits for future service.
For those IT organizations that want some reassurance that the facilities are being maintained according to industry standards, colocation companies hire auditing firms to certify that proper maintenance is being performed and that other protocols are being followed in areas such as cybersecurity, physical security and access, not hiring convicted felons, etc. This certification, called SOC-2, is issued by AICPA-trained auditors. AICPA stands for the American Institute of Certified Public Accountants. And the only requirement to be a SOC-2 auditor is that you have to be a Certified Public Accountant (CPA).
The problem is that most CPAs know next to nothing about cybersecurity, and absolutely nothing about data center engineering systems (which require decades to learn and truly understand). It would be like getting ready to board a Boeing 737, and a CPA comes up to you in the boarding line and says, “Don’t worry, I audited this company, THIS plane is safe!” Would you bet your life on the validity of that pronouncement? I wouldn’t. Yet the IT organizations of large corporations all over the world believe in the validity of the SOC-2 (or its European equivalents), which is actually worthless.
The combination of reliance upon SLAs (which, if a serious outage punches your company right in the face, promise you credits for more of the same) and a clearly fraudulent SOC-2 results in companies having blind faith in the colocation companies to deliver reliable, uninterrupted service.
This trust is not only misguided, but extremely dangerous, as a recent investigation from Hindenburg Research demonstrates:
What it boils down to is that when Equinix became a REIT, they decided to divert maintenance budgets to other accounts that appeared to make the company grow faster than everyone else in the market. The resultant bump in stock prices appears to have been the incentive for this move, so that executives could cash out some $300 million in bonuses.
The reduced maintenance budgets were just enough to let vendors come in and do equipment checks, but any defects in the IT safety nets weren’t repaired, merely swept under the rug:
As they became even more aggressive in pushing their stock prices up, they just slashed their maintenance budgets even further:
Of course, the deficient safety nets resulted in outages to various IT clients, where the only cost to Equinix was future credits to the clients.
Paraphrasing the Fight Club equation:
Take the number of data centers in the portfolio, A. Multiply it by the probable rate of failure due to not performing maintenance, B. Then multiply the result by the cost of future credits to the client, C. A times B times C equals X. If X is less than the cost of doing maintenance, we don't do it.
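As a rough sketch, the equation can be written out in a few lines of Python. The function names and every sample figure below are hypothetical, chosen only to show how the comparison works; they are not data from any operator.

```python
# Sketch of the paraphrased equation: X = A * B * C, compared against
# the cost of maintenance. All numbers here are hypothetical.

def expected_credit_cost(num_sites: int, failure_rate: float,
                         credit_cost_per_failure: float) -> float:
    """X: expected cost of service credits across the portfolio."""
    return num_sites * failure_rate * credit_cost_per_failure


def skips_maintenance(num_sites: int, failure_rate: float,
                      credit_cost_per_failure: float,
                      maintenance_cost: float) -> bool:
    """True when X is less than the cost of doing the maintenance."""
    x = expected_credit_cost(num_sites, failure_rate, credit_cost_per_failure)
    return x < maintenance_cost


# Hypothetical portfolio: 200 sites (A), 25% annual failure rate from
# deferred maintenance (B), $40,000 in credits per failure (C).
x = expected_credit_cost(200, 0.25, 40_000)
print(f"Expected credit cost X: ${x:,.0f}")
print("Skip maintenance at $5M?", skips_maintenance(200, 0.25, 40_000, 5_000_000))
print("Skip maintenance at $1M?", skips_maintenance(200, 0.25, 40_000, 1_000_000))
```

Note what the formula leaves out: the client's own losses from the outage never appear in it, because credits for future service cap the operator's exposure.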
The poor maintenance of data centers has had predictable results. The standard system architecture in the data center industry is “Tier-III,” with an expected reliability of 99.982%. Put more generally, that translates into odds of failure each year of 1:5,555.
Compared to that performance standard, actual reliability is more than three orders of magnitude worse, at a rate of 1:5.5.
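The arithmetic behind these two figures can be sketched in a few lines of Python. This is a rough illustration that follows the reading used here, treating the 0.018% shortfall as a 1-in-N figure; the function and variable names are mine, and the 8,760 hours-per-year constant is the only number not taken from the text.

```python
import math

HOURS_PER_YEAR = 8760            # 365 days * 24 hours (non-leap year)
TIER_III_AVAILABILITY = 99.982   # percent, design figure quoted above
OBSERVED_ODDS = 5.5              # observed 1-in-N figure quoted above


def shortfall_fraction(availability_pct: float) -> float:
    """Fraction of the year not covered by the availability figure."""
    return 1.0 - availability_pct / 100.0


def odds_one_in(availability_pct: float) -> float:
    """Express the shortfall as a 1-in-N figure."""
    return 1.0 / shortfall_fraction(availability_pct)


design_odds = odds_one_in(TIER_III_AVAILABILITY)
design_downtime_hours = shortfall_fraction(TIER_III_AVAILABILITY) * HOURS_PER_YEAR
gap_ratio = design_odds / OBSERVED_ODDS
orders_of_magnitude = math.log10(gap_ratio)

print(f"Design odds: 1 in {design_odds:,.0f}")
print(f"Design downtime: {design_downtime_hours:.2f} hours per year")
print(f"Observed vs. design: {gap_ratio:,.0f}x worse "
      f"({orders_of_magnitude:.2f} orders of magnitude)")
```

The same 99.982% figure can also be read as an allowance of about an hour and a half of downtime per year, which is another way to see how far a 1:5.5 rate sits from the design target.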
If the airline industry suffered “severe” outages as often as data centers, we would see a commercial passenger jet crash every day.
The outages have cost companies heavily, with those who rely most heavily on colocation data centers hit the hardest, according to the latest research data:
The same report clearly shows that maintenance mismanagement is the #1 cause of data center outages, as shown in the analysis here.
All of these outages have occurred, mind you, at a time when the national electric grid enjoys high reliability. In such times, maintenance efforts often take a back seat to other more profitable projects, as safety nets are rarely needed.
But as the grid becomes more unstable, the importance of the backup systems when the power is interrupted becomes far greater. Nobody thought about lifeboats on the Titanic, until it was too late.
What Happens From Here?
As grid reliability decreases, more data centers are going to get hit with utility interruptions, testing the backup systems. Those that have been poorly maintained will fail, disrupting business operations, in some cases fatally.
These disruptions will have knock-on effects, many of which can be logically extrapolated: supply-chain disruptions, losses of customer or financial data, failures of logistics, additional utility failures in cascade (remote control of utilities often goes through data centers), losses of internet access or cellular service, and so on.
While we typically think of such scenarios at a high level, the effects of such colocation outages could easily manifest at the local level, especially in the logistics and transportation sectors, which are responsible for restocking grocery and big-box stores, gas stations, pharmacies, and pretty much every other retail outlet you can think of.
Ideally, the software that companies use to shift automatically from one data center to another when a site fails will do so seamlessly, invisible to the end-users. Unfortunately, these systems are rarely tested, because the failover itself sometimes fails during a test, causing the very business interruptions the company was hoping to avoid.
Similar circumstances exist for “cloud” companies as well. Interestingly, in a private conversation, one of the top IT professionals in the world on cloud computing flatly stated that “95% of cloud computing configurations are done wrong; any deviations in the system can cause cascade failures, and most cloud people don’t even know it.” My takeaway is that cloud computing is not a lifeboat for companies.
Are There Any Solutions?
In a word, “yes.”
If enterprise IT organizations force colocation companies to properly maintain their data centers, the need for additional locations to maximize redundancy and resilience is dramatically reduced, leading to reduced operational costs for IT clients and correspondingly greater profitability. This can be accomplished by SLAs with “teeth,” as well as third-party audits conducted by experienced professionals in the engineering field. [For full disclosure, that is the purpose of my business.]
To the extent that AI is growing, the current GPU-based AI approach is doomed to failure, due to a hidden trap called the von Neumann bottleneck. The AI systems being deployed now are like steam engines from two centuries ago: pushed to maximum efficiency, until something new came along that changed the entire industry, such as the diesel engine. And the industry will retool to accommodate that new technology, because the efficiency gains and reduced costs will make it necessary.
Finally, and most importantly, grid instability will become more frequent and obvious, especially during severe weather conditions. At such times, data centers will be kept powered while the public will be expected to tolerate the utility outages, sometimes resulting in fatalities. The public backlash from this will be the ultimate deciding factor, forcing the industry to become more responsible to their local communities.