Show me a Tier/Rated-4 data center and I’ll show you at least 10 Single Points of Failure!

Tier/Rated-4 data centers are considered 'Fault Tolerant', meaning that there is no Single Point Of Failure (SPOF) that could affect the critical IT load. So how do I dare to claim that your Tier/Rated-4 data center, whether you reference Uptime or TIA-942, still has at least 10 SPOFs?

There is a very simple way to find out how many SPOFs you might have in your data center. Go to a meeting room. Invite all of your operational staff, including facilities staff, floor management staff, etc., into the room. Now count the number of people... Yes, each and every one of your staff is a potential point of failure who could bring your critical IT down!

To run a highly available, effective and efficient data center, you need to get three things right:

  1. have the appropriate facilities/IT infrastructure
  2. have appropriate operational policies/procedures/work instructions at the right maturity level
  3. have the right/competent people to execute these processes.

Recently, EPI conducted an industry-wide survey to find out the typical causes of downtime. Below are the results.

[Image: survey results on the typical causes of downtime]

The outcome of this survey is very similar to the results of other organizations that have conducted such surveys over the last few years. However, we also asked another question, which gave a very interesting result.

[Image: survey results for the additional question]

This clearly indicates that the majority of data centers know that these issues can be prevented. But what proactive steps do data center owners need to take to get this matter under control?

As said, most data centers have done their due diligence on the site infrastructure, which is indeed the first step to take. The next step is to make sure the policies, procedures and work instructions are aligned with the business objectives. ISO standards are often used, but they have a number of shortcomings. You can refer to this article to read more about why ISO standards are a great start but certainly not enough.

How about training? It is interesting to see that most data centers have few qualms about adding yet another network device costing 50-80k USD for the sake of redundancy, but at the same time are not willing to invest in properly training their staff. We frequently hear that data centers run their own internal training programs with the aim of reducing training cost. But how effective is that really? Being a great car mechanic doesn’t necessarily make someone a good driver. The same goes for data center engineers: they might be great engineers, but are they also good at training others and transferring knowledge?

What about politics? Do you think that the senior engineer who has built up his/her competences over many years is willing to simply share that hard-earned knowledge with a new person, and thereby potentially make himself/herself less valuable? Let’s face it, we live in a world where competition is everywhere and senior engineers like to stay senior…

So how do you fix the problem? There are two critical steps.

First of all, read up on and/or download the DCOS. Get access to it here. Do your own gap analysis or, even better, have an external audit conducted to ensure an impartial and unbiased view of the maturity of your processes etc. Click here if you want more information on DCOS audits.

Second, for training, try out the DCPT by clicking here to see how you can, in a few simple steps, create a full-blown training plan for yourself, or your staff, based on the Data Center Competence Framework.

Got more questions? Email us at [email protected]

Mark Chappell - DCOM DCD

Data Centre Operations - Audit, Advice, and Guidance services

2 years ago

Interesting and thought-provoking read - thanks for sharing and highlighting this "issue". Investing in people training will not only upskill them but additionally helps to make them feel part of the team/company (and not just a number), and make them more engaged. Like the comment about investing in technology but not training - all too often true.

Ts. Rizman Yusam

Data Center Management Unit, Digital Consultation Division

2 years ago

Yes.. agreed... It's all about People, Process, Technology to achieve high availability in all aspects of DC operation.

Bakir DAOUD

Data center facility expert #ATD #Algeria | GAAN Algeria

2 years ago
Organisation Resilience Management Pte Ltd

Managing Director at Organisation Resilience Management Pte Lt

2 years ago

Well said!

Ursus Custer Oliveros

Cloud Computing, Digital Transformation, Project Management, Business Continuity and Data Centers.

2 years ago

Technology alone will not make a Data Center Rated/Tier 4, Process and People are equally important if not more so..

More articles by Edward van Leent