The nuances of corporate resilience
Frederik Borup Helweg-Larsen
Cyber Security Advisor | Risk Management | Crisis Management | Emergency Preparedness | Process Automation | Preparedness Exercises | Board Advisory | Data-driven Security Overview | Captain and Liaison Officer for Cyber
by Frederik Helweg-Larsen, CISM, [email protected]
In conversations on the subject, it is often completely unclear what "corporate resilience" means. Terms and definitions are frequently mixed up, and it does not help that the various standards use the terms differently.
Why terms can make or break your efforts
As you may be well aware, many different standards, legislation, best practices, and defining organisations use and define the various terms involved in creating corporate resilience in different, sometimes even contradictory ways. The most crucial aspect in this regard is to ask yourself some fundamental questions before starting the continuous journey of establishing corporate resilience:
Only when these central questions are posed can the foundational work begin. We hope that having this article to facilitate the conversation around a common understanding will help you on your way to succeed with corporate resilience.
By preparing a model that creates an overview of corporate resilience elements, I sought to clear up this confusion. As a prelude to the walkthrough of the different terms, two main concepts must be introduced:
Corporate resilience: Understood as a broad umbrella term covering the activities that seek to bolster the organisation's ability to withstand disruptions. It therefore covers most of the strategic activities described below.
Contingency plans: Formalised procedures that address disruptions to the organisation by providing an alternative plan, commonly known as plan B. Contingency plans are thus understood broadly as the formalised procedures developed to bolster corporate resilience.
With these central concepts defined, we can begin diving into the different areas of the model presented below:
Let me talk you through the model, and we'll start from the bottom:
Purple boxes – The prerequisites for good corporate resilience
The purple fields at the bottom of the model are the company's basic prerequisites for contingency planning to function in practice and not just be an organisational exercise.
This foundation should consist of at least the following elements:
6. Technical Recovery Plans/Disaster Recovery Plans
These plans dictate a concrete step-by-step recipe for re-establishing a system and which prerequisites, such as licenses, backup, rights, etc., are necessary to carry out the recovery.
The plans must only contain the essential information necessary for systems recovery and must not be confused with system documentation. If you need external consultancy assistance for the recovery, then this plan must contain the information that the consultant needs to be able to carry out the work.
Of course, the plans must be stored in a place where you can still access them, even if the infrastructure is inaccessible. These plans are also used during normal operational breakdowns where systems must be re-established and are therefore not reserved for a crisis.
7. Playbooks
To counter a predictable scenario affecting several systems simultaneously, you can describe a good playbook for the specific situation. For example, this is what you see in a first aid book, where there is a sequence of different actions for, e.g., burns and drowning accidents.
It is common to have a scenario-based action plan for ransomware attacks. It describes the reaction pattern you agreed to, with specific prepared actions that can prevent the situation from developing negatively. As an example, these playbooks can describe how a network is quickly divided into segments (also called island operation) or how parts of the network are shut down.
A scenario-based action plan can be automated as scripts that carry out the actions quickly, as the reaction must be completed within a few minutes. Managerial approval should be given for these plans, as they often affect the company's activities dramatically and will initiate one or more business contingency plans.
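As an illustration of a scripted playbook, the following is a minimal sketch, not a finished tool. The class names, the step functions and the "IT manager" approver are hypothetical, and the step functions are placeholders where a real script would call the organisation's own network tooling. The sketch shows two ideas from the text: the steps run in an agreed order, and nothing runs without prior managerial approval.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class PlaybookStep:
    name: str
    action: Callable[[], str]  # returns a short status message


@dataclass
class Playbook:
    scenario: str
    approved_by: Optional[str]  # manager who pre-approved the playbook
    steps: List[PlaybookStep] = field(default_factory=list)

    def run(self, dry_run: bool = True) -> List[str]:
        """Run the steps in order and return a log of what happened."""
        if not self.approved_by:
            raise PermissionError(
                f"Playbook '{self.scenario}' has no managerial approval"
            )
        log = []
        for step in self.steps:
            if dry_run:
                # Rehearsal mode: record the step without executing it.
                log.append(f"[dry-run] {step.name}")
            else:
                log.append(f"{step.name}: {step.action()}")
        return log


# Hypothetical placeholder actions; real ones would call firewall or
# switch tooling that is specific to the organisation.
def isolate_segments() -> str:
    return "network split into islands"


def shut_down_external_access() -> str:
    return "external access disabled"


ransomware = Playbook(
    scenario="ransomware",
    approved_by="IT manager",
    steps=[
        PlaybookStep("Isolate network segments", isolate_segments),
        PlaybookStep("Shut down external access", shut_down_external_access),
    ],
)

for line in ransomware.run(dry_run=True):
    print(line)
```

The dry-run default is a deliberate choice in this sketch: the playbook can be rehearsed during exercises without touching production, and the live run has to be requested explicitly.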
8. Prioritisation and dependencies
When systems are to be recovered, the IT department works through them in a queue. The IT infrastructure is recovered first, followed by the systems that the remaining systems depend on. Based on the importance of its systems, the business can then ask to be serviced in a specific order, which can be prepared in advance.
This is often done based on an assessment of the business impact (called Business Impact Assessment, BIA). It is a prerequisite that the most critical systems and services are tackled first and that critical business processes are re-established as soon as possible.
Since there are often several business areas, those responsible for them must agree on their mutual priorities in advance so that there are no internal conflicts in a crisis situation. The order can be changed in the situation, but it is much faster to start with something rather than nothing.
Dependencies between systems and services can be more difficult to uncover if you do not have a well-maintained CMDB (Configuration Management Database). If it is not well described, then it is beneficial to prepare some simple sketches showing basic dependencies, as it helps make decisions in a crisis and affects the recovery sequence.
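Even a simple dependency sketch can be turned into a recovery sequence. The example below is a minimal sketch under stated assumptions: the system names, the dependency map and the priority numbers are invented for illustration, and priorities are assumed to come from the BIA with 1 as the most critical. It uses a topological sort (Kahn's algorithm) with a priority queue, so that prerequisites are always recovered first and, among the systems that are ready, the most critical one is picked.

```python
import heapq
from typing import Dict, List, Set


def recovery_order(
    depends_on: Dict[str, Set[str]], priority: Dict[str, int]
) -> List[str]:
    """Return a recovery sequence that respects dependencies.

    depends_on maps each system to the systems that must be running first.
    priority maps each system to its BIA ranking, where 1 is most critical.
    """
    systems = set(depends_on) | {d for deps in depends_on.values() for d in deps}
    remaining = {s: set(depends_on.get(s, set())) for s in systems}
    dependants: Dict[str, Set[str]] = {s: set() for s in systems}
    for system, deps in remaining.items():
        for dep in deps:
            dependants[dep].add(system)

    # Systems without a ranking are placed after all ranked systems.
    default = max(priority.values(), default=0) + 1
    ready = [(priority.get(s, default), s) for s, deps in remaining.items() if not deps]
    heapq.heapify(ready)

    order: List[str] = []
    while ready:
        _, system = heapq.heappop(ready)
        order.append(system)
        for dependant in dependants[system]:
            remaining[dependant].discard(system)
            if not remaining[dependant]:
                heapq.heappush(ready, (priority.get(dependant, default), dependant))

    if len(order) != len(systems):
        raise ValueError("Circular dependency: no valid recovery order exists")
    return order


# Hypothetical example data
depends_on = {
    "network": set(),
    "directory": {"network"},
    "database": {"network"},
    "erp": {"directory", "database"},
    "intranet": {"directory"},
}
priority = {"erp": 1, "database": 2, "directory": 2, "network": 3, "intranet": 5}

print(recovery_order(depends_on, priority))
# ['network', 'database', 'directory', 'erp', 'intranet']
```

Note that this greedy choice only looks at the systems that are ready right now. A low-ranked system that is a prerequisite for a critical one does not inherit the higher ranking, so the result is a starting point for the discussion between business areas, not a replacement for it.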
9. Robust design and architecture
Some of the problems caused by a crash or a cyber attack can be reduced in advance with a robust architecture and systems design. One should aim for a correlation between a system's criticality assessment and the design's robustness. It is often an economic assessment, as it is expensive to have redundant systems.
Examples of robust design are redundant systems, duplicated services, segmented networks, backup designed for crisis recovery, and cloud services that can run independently of on-premise systems.
10. Alternative Services
Alternative services are the IT function's plan B.
Business continuity plans will describe the wishes for emergency operations based on some alternative services that must function in everyday life so that they are ready in a crisis.
This can be an alternative communication platform or daily data dumps that can ensure the continued operation of the business. These services can be "dormant" until needed and must again be independent of the shared networks.
This corresponds to the former practice of printing lists every day so that they could be used in a crisis. You can still do that, but the approach has fallen out of favour, as it is too resource-intensive and often does not cover all your needs.
Blue boxes – the daily operations
The blue fields are the areas that belong to the company's daily operations. This entire area aims to ensure that an incident does not develop into a crisis.
5. Operations consists of:
You can see a scale from one to five on the model's left side. It is a scale for the severity of the incident where 1 is the most critical. The most serious event in an IT operations organisation is called a Major Incident. One or more major incidents will often draw on all the resources that an IT department has available, which means that the normal service level drops noticeably.
When a serious incident occurs, the IT department immediately tries to solve the problem by all conceivable means. If this cannot be done, crisis management presses the button and takes over the handling of the situation. Notifying crisis management of all major incidents saves time.
It will typically be an IT manager in the company who has the role of IT crisis manager, thus deciding whether there is a crisis.
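The escalation logic described above can be written down as a small decision rule. This is only a sketch: the `Incident` class, the field names and the wording of the returned actions are hypothetical, and the only facts taken from the text are the 1 to 5 scale, that 1 is a Major Incident, and that crisis management is notified of every major incident.

```python
from dataclasses import dataclass

MAJOR_INCIDENT = 1  # the scale runs from 1 (most critical) to 5


@dataclass
class Incident:
    title: str
    severity: int  # 1-5, where 1 is the most critical
    resolved_by_operations: bool = False


def next_action(incident: Incident) -> str:
    """Suggest the next step for an incident based on its severity."""
    if not 1 <= incident.severity <= 5:
        raise ValueError("severity must be between 1 and 5")
    if incident.severity > MAJOR_INCIDENT:
        return "handle in daily operations"
    if incident.resolved_by_operations:
        # Crisis management is still informed, even when operations coped.
        return "notify crisis management; resolved in operations"
    return (
        "notify crisis management; "
        "IT crisis manager decides whether to declare a crisis"
    )


print(next_action(Incident("Printer offline", severity=4)))
print(next_action(Incident("ERP unavailable", severity=1)))
```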
Red boxes – Crisis Management
The red boxes cover the part of corporate resilience that is activated the second an incident has escalated and become a crisis.
This includes areas such as:
2. IT Crisis Management
This is the IT department's crisis management and is a layer on top of the Major Incident. A war room is run with a clear division of roles following a structured agenda. More resources are often brought in, and decisions are made with a prepared mandate. This is where the big overview is created and maintained, and there will often be roles representing HR, communication, coordination, business management, and facilities.
In this team, the long-term efforts are coordinated, prioritised, and communicated to all stakeholders. IT crisis management can obtain support from outside specialists to handle the situation and take responsibility for deviations from policies and guidelines.
3. IT Service Continuity
For many, this is an almost unknown concept, and it can only be defined and implemented in close cooperation with the business management. Emergency operation is not the same as redundancy; instead, alternative systems or data extracts are made available in other ways. If the normal systems fail, it can be a spreadsheet that stands ready to replace the quality management system.
Often, only a limited amount of data is needed to keep one's business going in a crisis, but that data is vital. It may well be possible to deliver goods from a warehouse with a pen and a pad, but it will take a long time to enter this data when the systems are back. Therefore, an alternative digital solution is often better to avoid major backlogs.
4. Business Continuity
These plans describe how the business will continue if you suddenly cannot continue operations through your normal business processes. These plans are not prepared by the IT department but by the employees who know the work processes. They know their own area's daily routines, needs and rules. It is often those who are responsible for processes and business areas who are responsible for the preparation of business continuity plans.
Business continuity is often used as an umbrella description for crisis management. But that's just one area of overall corporate resilience. Whatever you call it, be specific.
1. Corporate Crisis Management
This is the entire organisation's crisis management. If IT is down, it can drag the whole company down, and then the whole company is in crisis. Then, the IT crisis management, with the IT director at the end of the table, resolves the IT crisis, while the overall crisis management is handled here. Here, it is usually the CEO who sits as the crisis manager.
This plan is not limited to IT incidents but can also accommodate war, extreme weather, pandemics, etc. However, an IT crisis will very often activate the entire company's contingency efforts. Therefore, the two crisis teams must cooperate closely and well.
Testing the plans
All plans should be tested regularly with scenarios that are realistic and challenging. The plans must be tested individually to make the testing as concrete and specific as possible.
Preparing a test plan that extends over three years is a good idea, as it is often impossible to test all plans within 12 months.
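A multi-year test plan can start as a simple rotation. The sketch below assumes one test slot per quarter and spreads the plans evenly over the period; the function name, the slot labels and the plan names are hypothetical, and a real schedule would also weigh criticality and available staff.

```python
from typing import Dict, List


def build_test_schedule(plans: List[str], years: int = 3) -> Dict[str, List[str]]:
    """Distribute plans round-robin over quarterly test slots."""
    slots = [f"Year {y} Q{q}" for y in range(1, years + 1) for q in range(1, 5)]
    schedule: Dict[str, List[str]] = {slot: [] for slot in slots}
    for index, plan in enumerate(plans):
        # Wrap around when there are more plans than slots.
        schedule[slots[index % len(slots)]].append(plan)
    return schedule


# Hypothetical plan names
plans = [f"plan-{i}" for i in range(14)]
for slot, scheduled in build_test_schedule(plans).items():
    print(slot, scheduled)
```

With 14 plans and 12 quarterly slots, the first two quarters each receive two plans and the rest receive one, so every plan is tested once within the three years.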
The testing of the plans must be adapted to a specific purpose and have an appropriate level of ambition. It is important that concrete learning comes out of the tests, and therefore, it is advantageous to create a test plan that increases complexity as you improve. It could be in these steps:
Good advice - and the human aspects
When a crisis arises, it is often necessary to control the course of the battle with a military mindset.
You have to run fast. Decisions must be made with a sturdy hand, and the employees must make an extra effort. Therefore, one must also have an eye on the human aspect of the crisis.
Often, during an attack, people are pushed very hard for a long time, perhaps around the clock, and at some point they will no longer be able to sustain the momentum. The HR function must safeguard employees' well-being and prevent stress and a bad working environment. It is also important to remember that most employees do not have a contractual obligation to work extra in a crisis, which must be handled carefully.
Demant and Mærsk are examples of companies that were hit by a cyber attack where the consequences lasted for several months and some systems never recovered. Employees also do not forget how hard it was to be part of the incident, and the handling becomes part of the company's future image.
Don't pave the road while you are driving it
It should go without saying that corporate resilience must be established before a cyber attack occurs.
Building your corporate resilience takes time, and you don't have much of that when attacked.
The time spent preparing and making plans is deducted directly from the time it takes to recover after a cyber-attack. The decisions that must be made under tremendous time pressure are often difficult, so it is better to have these discussions before the incident occurs. The pace during a cyber-attack can then increase, and there is less uncertainty, as you can adhere to well-known frameworks.
If you seek inspiration for good contingency planning, look toward the Defence, the Rescue Services or general First Aid. Here are methods tested in life-threatening situations and adapted through generations. Use their operational experience and notice how simple and concise the methods often are.
Recommendations for inspiring reading material:
What you can do today
Having read this article, you now have an understanding of the different elements central to building organisational resilience. But remember, knowledge is power, and power is expanded when shared rather than diminished. By bearing this concept in mind, our immediate recommendation is to begin by focusing on these simple steps: