Datacenter Downtime Real Costs

Welcome to the latest issue of IT3, where our goal is to help educate the masses on current data center issues. This issue focuses on the real costs of data center downtime and some real-world examples you might need to consider for your organization.

In the digital age, the data center is the heartbeat of the business. Whether you have an on-premises facility, use a colocation provider, or leverage the cloud, these facilities house the digital infrastructure that keeps our modern world running smoothly. So what does it cost when one goes down? The answer is simple but staggering: the real costs of downtime are immense, in both financial and reputational terms.

Financial Consequences

First, let’s define financial consequences. Financial consequences are those that can be measured in the business's bank account. Soft dollars, such as salaried staff losing an hour to downtime, are not included in this category, as the cost to the business does not change. The most common financial consequences are as follows:

Lost Revenue: Perhaps the most immediate and obvious metric is the loss of revenue. But is the revenue truly lost, or is it delayed? Even a brief outage can disrupt e-commerce platforms, customer support, and financial transactions, leading to substantial revenue losses. For example, a Wall Street broker might lose millions in a few minutes if their data center is offline, and this revenue will likely not be recovered. On the other hand, an outage at a data center that supports issuing licenses and permits would not result in lost revenue. The entity would still get the revenue, but it may arrive a few days later than expected.

Operational Expenses: People will claim increased operational expenses during a data center outage. Employees still need to be paid, vendors will still deliver supplies, and service providers will still perform their roles. However, these costs are incurred with or without an outage, and additional staffing costs would apply only to hourly staff. Typical increases in operational costs are the incremental spending on vendors and contractors who provide resources and equipment to restore services in the data center.

Data Recovery and Restoration: When a data center goes down unexpectedly, there's a risk of data corruption or loss. Data recovery and restoration can be a time-consuming and expensive process. Organizations may need to invest in specialized services to retrieve or recreate lost data. One common misconception is, “My data is in the cloud, so it’s protected.” This is often not true for some of the most widely used platforms. Take Office 365, for example. Microsoft guarantees availability and uptime but does not guarantee the ability to recover your data. To guarantee data recovery, you will need third-party tools and services to back up and store data offline (not in the cloud). Multiple vendors can guarantee the ability to back up, restore, and recover your data, but it is an additional service.

Regulatory Penalties: Data center downtime can result in severe regulatory penalties in heavily regulated industries such as healthcare and finance. Failing to meet service level agreements (SLAs) or compliance requirements can lead to fines, legal action, and even loss of an organization’s licensing/certification.
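The four categories above can be tallied in a short calculation. The following is a minimal sketch in Python, not a costing standard: the `OutageCosts` fields, the `hard_dollar_cost` function, and every dollar figure are hypothetical and chosen only for illustration. It counts revenue as a cost only when it is truly lost rather than delayed, and it leaves out soft dollars such as salaried staff time.

```python
from dataclasses import dataclass


@dataclass
class OutageCosts:
    """Hard-dollar inputs for one outage (all field names are illustrative)."""
    revenue_at_risk: float           # revenue normally earned during the outage window
    fraction_recoverable: float      # share of that revenue that is only delayed (0.0 to 1.0)
    incremental_vendor_spend: float  # extra vendor/contractor spending to restore service
    hourly_staff_overtime: float     # extra pay for hourly staff; salaried time is excluded
    data_recovery_spend: float       # specialized recovery or restoration services
    regulatory_penalties: float      # fines or SLA penalties


def hard_dollar_cost(costs: OutageCosts) -> float:
    """Sum only the costs that change the business's bank balance."""
    if not 0.0 <= costs.fraction_recoverable <= 1.0:
        raise ValueError("fraction_recoverable must be between 0.0 and 1.0")
    # Delayed revenue still arrives later, so only the unrecoverable share counts.
    lost_revenue = costs.revenue_at_risk * (1.0 - costs.fraction_recoverable)
    return (
        lost_revenue
        + costs.incremental_vendor_spend
        + costs.hourly_staff_overtime
        + costs.data_recovery_spend
        + costs.regulatory_penalties
    )


# Hypothetical broker: none of the revenue comes back after the outage.
broker = OutageCosts(1_000_000, 0.0, 20_000, 0, 10_000, 0)
# Hypothetical permit office: all of the revenue arrives a few days late.
permit_office = OutageCosts(50_000, 1.0, 5_000, 0, 0, 0)

if __name__ == "__main__":
    print(f"Broker hard-dollar cost: ${hard_dollar_cost(broker):,.0f}")
    print(f"Permit office hard-dollar cost: ${hard_dollar_cost(permit_office):,.0f}")
```

The two sample organizations mirror the lost-versus-delayed distinction: the same outage costs the broker its revenue, while the permit office pays only for restoration.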

The Hidden Costs

Hidden costs are less obvious and much more difficult to measure. While some organizations will tout metrics such as “$xx,xxx per minute of downtime,” your mileage will vary based on industry, the length of the outage, and whether or not the outage garners the press's attention.

Customer Churn: Customers are hyper-sensitive to disruption and public opinion, and recurring downtime will open the door to alternative providers. The cost of acquiring new customers to replace those lost during downtime can be substantial.

Reputational Impact: The effects of downtime can extend far beyond the immediate incident. Businesses that suffer frequent outages may find it challenging to attract investors, secure partnerships, or expand into new markets. For service providers, recurring outages will damage the company's reputation. In today’s world of social media, negative publicity and false information can spread like wildfire, amplifying the situation's impact.

Preventing Outages

So how does an organization prevent outages?

  • Redundancy and Failover Systems: Implementing redundant systems and equipment helps ensure that critical services remain available. This can be an expensive option, so make sure you understand the business's availability expectations. For example, a client recently did not understand the power thresholds required to fail over electrical loads between two UPS systems. Generally, a UPS should never be loaded to more than 80% of its rated capacity, so in a two-unit failover configuration, each UPS should be loaded to no more than 40%. In this example, the client loaded both UPS units to 47%. When one unit failed, its load transferred to the surviving UPS, which immediately jumped to 94% of its rated capacity (or higher) and failed as well. The result was two dead UPS units and an outage.
  • Regular Maintenance, Monitoring, and Training: Regularly scheduled maintenance and proactive systems monitoring will identify problems before they cause an outage. Organizations often lack documented standards, maintenance procedures, and emergency operations procedures. Even when these documents exist, staff often do not consult them, resulting in outages caused by human error. Educating and training employees on standard procedures can eliminate common errors that lead to downtime. For example, suppose a heavy lightning storm is on the town's edge and headed for the data center. Many clients would simply choose to ride it out and see if the power drops. The smart decision is to immediately go to generator power and provide clean power to the data center. This will protect the UPS units from multiple power sags and spikes, extending the units' life and reducing the risk of a UPS failure.
  • Disaster Recovery Plans: Have comprehensive disaster recovery plans and test them annually to ensure you can recover. In over twenty years, Excipio has never had a client that could recover from a DR event. Why? DR plans are often written by individuals dedicated to supporting that system, and these key individuals may not be around during the outage. Too often, a written step in a DR plan represents ten or more steps in the head of the system expert.
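The UPS failover arithmetic from the redundancy example can be checked with a few lines of code. This is a minimal sketch in Python, assuming two identically rated UPS units and assuming the survivor picks up the failed unit's entire load. The function names are hypothetical, and the 80% ceiling is the rule of thumb cited above rather than a vendor specification.

```python
# Rule-of-thumb ceiling cited above: never load a UPS past 80% of rated capacity.
MAX_LOAD_FRACTION = 0.80


def max_safe_load_per_unit() -> float:
    """Highest per-unit load that keeps a two-unit pair safe after one unit fails."""
    return MAX_LOAD_FRACTION / 2


def surviving_unit_load(load_a: float, load_b: float) -> float:
    """Load on the survivor, assuming equal ratings and a full load transfer."""
    return load_a + load_b


def survives_failover(load_a: float, load_b: float) -> bool:
    """True if the surviving unit stays at or below the 80% ceiling."""
    # A small tolerance avoids a false failure from floating-point rounding.
    return surviving_unit_load(load_a, load_b) <= MAX_LOAD_FRACTION + 1e-9


if __name__ == "__main__":
    # The client in the example ran both units at 47% of rated capacity.
    after_failure = surviving_unit_load(0.47, 0.47)
    print(f"Safe per-unit load: {max_safe_load_per_unit():.0%}")
    print(f"Survivor load after failure: {after_failure:.0%}")
    print(f"Within ceiling: {survives_failover(0.47, 0.47)}")
```

Running the check against measured loads before a failure, rather than after, is the point: two units at 47% each look comfortably below 80% until one of them drops.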

Call to Action

Due to the complexity of data centers and IT systems, organizations may struggle to understand their options. So here are some suggestions:

  • Get an Assessment: If you have any doubt about the design, capacity, or operations of your data center, bring in a third party to provide an objective review and options to fix the issue(s).
  • Develop a Strategy: Is owning your data center a comparative advantage for your business? Know what it costs to operate your environment and the costs to outsource to external service providers. Moving to an external provider will always cost more money long-term, but their scale and experience should eliminate outages. However, moving to an external provider is no small task. Moving workloads requires detailed application dependency mapping, scheduled outage windows, physical move labor, external consultants, and project managers. Planning and executing a data center move is complex and expensive. As your staff may not have this expertise, rely on a consultant to help your organization understand the technical and financial options available.

