IT Disaster Preparedness

Many mature organisations have Business Continuity Plans and Disaster Recovery Plans (DRPs). Some of them even get tested regularly (although it is surprising how many never are, even in the largest and most critical organisations). But how many of them are truly fit-for-purpose in the event of a real-life disaster?

In my experience, the answer is "surprisingly few". More often than not, when a major IT outage or cyber attack occurs, the formal plans stay on the shelf, and we rely on a frighteningly small number of talented individuals to "wing it" and chart a course to recovery. These individuals need to have not only deep technical skills, but a broad understanding of the organisation's IT landscape and how its IT systems support key business processes.

Usually our hard-working IT teams manage to "muddle through" and restore service, but often it takes longer than it needs to, and results in excessive risk and pressure. We need to do better than this.

Why does this happen?

An effective DRP can only be created based on a proper understanding and agreement across the organisation on which business processes are truly critical; which of them are completely dependent on IT systems; and which of them have temporary manual workarounds. This is far more complex than it seems. Often I have seen this planning done in a very "shallow" manner that results in some of these traps:

  • "Everything is critical" - if business groups are asked questions such as "how critical is system X" or "how long can system X be down", without any mature discussion on trade-offs and costs, it is hardly surprising when the answer ends up as "we want it all and we want it now". This adds no value to disaster planning - no organisation can provide an "always-on" level of service for every system. Executives need to be held accountable for the costs and consequences of their informed decisions, and be challenged to consider alternatives and compromises.
  • "Let's just focus on a few key systems" - often business continuity planning is seen as "an issue for IT to work out", and for IT people it is convenient to focus on disaster recovery for only a handful of systems that are obviously heavily utilised (such as ERP). These are often the systems that already have high levels of support and redundancy in place, so this approach reduces effort and complexity. However, in my experience, it's often the wrong focus. In a real-life disaster, you learn that the systems that the organisation truly can't live without might not be the "big-ticket" ones; they may actually be the small departmental applications that aren't widely known (in fact they might not even be supported by the IT group!). The only way to avoid this trap is to truly challenge stakeholders to imagine scenarios where all systems are down, and plan out what they would do.
  • "Focus only on physical disaster scenarios" - DRPs used to be built around easy-to-imagine scenarios such as "earthquake takes out the data center". In today's world, are these really the most likely scenarios to plan for? For example, what happens when a cyber attack renders your backup data center just as unusable as the primary one?
  • "If it's an IT problem, then it must be an IT solution" - often the teams that operate business processes have depended on modern IT systems for so long that they have forgotten how these processes were operated before the systems were implemented. In most organisations, many critical processes can be operated manually - albeit with great inconvenience and inefficiency - if the right thinking is put in beforehand. When doing this, ensure that you think through the worst-case IT impacts. For example, if the ordering system is down, the sales team might be able to call their key customers by phone to take their orders; but how will they get the customer contact details if they are unable to access the company's email system and their phones lose their company contacts?
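
To make the traps above concrete, here is a minimal sketch of a process-to-system register that can be tested against outage scenarios, including the "all systems down" case. It is only an illustration: the process names, system names, tolerable outage figures and workarounds are invented for the example, and a real register would come out of the cross-organisation discussion described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BusinessProcess:
    name: str
    systems: set                      # IT systems the process depends on
    max_outage_hours: float           # tolerable downtime agreed with the business
    manual_workaround: Optional[str] = None
    # Systems the manual workaround itself quietly relies on
    workaround_needs: set = field(default_factory=set)


def assess(processes, systems_down):
    """Classify each process as 'unaffected', 'workaround' or 'stopped'."""
    result = {}
    for p in processes:
        if not (p.systems & systems_down):
            result[p.name] = "unaffected"
        elif p.manual_workaround and not (p.workaround_needs & systems_down):
            result[p.name] = "workaround"
        else:
            result[p.name] = "stopped"
    return result


# Hypothetical register; every entry is an assumption for illustration.
REGISTER = [
    BusinessProcess("order taking", {"ordering"}, 4,
                    "phone key customers", {"email", "phone contacts"}),
    BusinessProcess("payroll", {"erp"}, 72, "reuse last month's payment file"),
    BusinessProcess("plant scheduling", {"scheduling spreadsheet"}, 8),
]

# Every system mentioned anywhere in the register, for the worst-case scenario
ALL_SYSTEMS = set().union(*(p.systems | p.workaround_needs for p in REGISTER))

if __name__ == "__main__":
    scenarios = [("ordering system only", {"ordering"}),
                 ("everything down", ALL_SYSTEMS)]
    for label, down in scenarios:
        print(label, assess(REGISTER, down))
```

In this invented register, order taking survives an isolated ordering outage through its phone workaround, but stops in the "everything down" scenario because the workaround depends on email and phone contacts. The small scheduling spreadsheet, which has no workaround at all, only shows up as a problem once someone bothers to list it.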


Doing it right

"By failing to prepare, you are preparing to fail" - attributed to Benjamin Franklin

Avoiding these traps and doing good planning is not necessarily complex, but it takes real effort, commitment and partnership across the organisation. When everyone is busy running and growing the business, it is hard to find time to commit to business continuity planning. IT needs to be heavily involved and support this activity, but it should not be expected to lead it. Consider bringing in risk/crisis experts if you don't have them in-house.

If you already have your organisation's commitment to do this, well done, and I hope the above points help to ensure that the process is effective from an IT perspective. If you don't, then you need to get the attention of the key stakeholders by explaining what would happen in the event of a realistic disaster scenario. Sharing real-life experiences from other organisations can be far more effective than a theoretical conversation. I urge you to put in the effort - one day it will pay off in spades.

More articles by Ellis Brover