Disaster Recovery Testing
Testing Times
This article examines the necessity for testing the Disaster Recovery Plan (DRP) that is a prime deliverable of Business Continuity Management (BCM). It also explores the progression through testing towards an effective DRP capability.
The article is delivered as a question and answer session. It begins (questions 1-4) by describing the DRP and placing it in its BCM context. The later questions track the development towards a tested DRP. The bonus final question may be useful when selling the need for DRP testing to your firm. I restrict myself to three responses to each question to prevent me from rambling on.
You are encouraged to reply with any other questions or statements you may have. My nine questions are:
1. Where do Disaster Recovery Plans come from ?
2. What is the point of a DRP ?
3. What happens if you do not have a DRP ?
4. How much does a DRP cost ?
5. If you do have a DRP how do you know that it works ?
6. What parts of a DRP need testing ?
7. What is the difference between a DR test and a real DR ?
8. Does the nature of the Business govern DRP testing ?
9. Is there any extra bonus from having a working DRP ?
Here we go. Bear in mind that my questions and answers are simply my opinions based on my experiences. Feel free to challenge both the questions and the answers.
1. Where do Disaster Recovery Plans come from ?
a. DRPs are developed from the Risk Assessment process and the subsequent Business Impact Analysis process. These processes are governed by the firm’s Security Policy.
b. The Risk Assessment (RA) process addresses the relative likelihoods that the firm will be threatened by disruptive, externally driven events such as fire, storm, earthquake, pandemic, civil unrest or cyber crime.
c. The Business Impact Analysis (BIA) process addresses the relative damage to the firm from the loss of each internal business process. The loss may be financial, legal or reputational, or it may hit market share, data integrity or the skill base.
2. What is the point of a DRP ?
a. The DRP prepares the firm for the eventuality that disaster will strike and hit on the vulnerabilities identified by the RA and BIA processes.
b. The DRP provides a plan of action, calmly worked out and agreed in advance, designed to minimise the impact of the disaster event and get the business going again as soon as possible.
c. The DRP says who will do what, how and when they will do it and who they must ask or tell.
3. What happens if you do not have a DRP ?
a. Nothing. Well, nothing until a disaster event occurs. Then everything will happen at once but not necessarily in any hoped-for order.
b. People will dither about, miss, mishandle or duplicate recovery actions and probably talk to the wrong people about the wrong things, assuming they know how to contact them in the first place. Recovery will take forever and even then the results might be untrustworthy.
c. Competitors and the Media will find all this amusing. Clients, the firm’s business community, and your Board, will not.
4. How much does a DRP cost ?
a. Setting up the DRP needs a BCM co-ordinator such as the ones who managed the RA and the BIAs. In addition, it needs technical recovery architects to design, agree and build the back-up and recovery procedures.
b. To run the DRP requires a management group to handle the Crisis and control all communications, a technical group to execute the recovery and a business process group to verify the results.
c. For every role in the DRP at least two people must be identified to ensure resilience of support and that there is no single point of failure.
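The two-person rule in 4c is easy to check mechanically once the role list is written down. The following is only a minimal sketch: the role names, the people and the function `under_covered_roles` are all hypothetical, invented for illustration rather than taken from any real plan.

```python
# Hypothetical roles that the DRP says must be staffed.
REQUIRED_ROLES = [
    "Crisis manager",
    "Technical recovery lead",
    "Business process verifier",
    "Communications lead",
]

# Hypothetical (role, person) assignments taken from the DRP contact list.
ASSIGNMENTS = [
    ("Crisis manager", "Alice"),
    ("Crisis manager", "Bob"),
    ("Technical recovery lead", "Carol"),
    ("Business process verifier", "Dan"),
    ("Business process verifier", "Erin"),
]


def under_covered_roles(assignments, required_roles, minimum=2):
    """Return the roles that have fewer than `minimum` distinct people."""
    # Start every required role with an empty set so that a role with
    # nobody assigned at all is still reported.
    people_by_role = {role: set() for role in required_roles}
    for role, person in assignments:
        people_by_role.setdefault(role, set()).add(person)
    return sorted(
        role for role, people in people_by_role.items() if len(people) < minimum
    )


print(under_covered_roles(ASSIGNMENTS, REQUIRED_ROLES))
# prints ['Communications lead', 'Technical recovery lead']
```

A check like this only counts names. It cannot tell you whether both people are likely to be unavailable at the same time, which is still a question for the walk-through.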
5. If you do have a DRP how do you know that it works ?
a. You test it. Then you update the Plan and do it again (and again).
b. First gather all the documentation and all the involved parties together and walk through everything – from names to roles to addresses to phone numbers to recovery procedures to battle boxes. Are they all present and correct, and are the declaration of Disaster and the ongoing tasks in the safe hands of a Crisis Management Team ? Will that recovery really meet the Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) for the business processes ? Who has the authority to declare a Disaster and trigger the DRP ?
c. Later move to the designated DR site, load it up and try a simulated recovery. No designated site ? Go back to 4a. Consider the implications of the simulation – Is the system as secure as Production ? Where will you get the data and are you allowed to use it for testing ? How will you be sure no transactions can leak between Production and Simulation during a test ? Do you have all the software licences necessary to run a copy of Production ? These answers are unstructured lists but they give a flavour of the thinking required when building up to a simulation.
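The RTO and RPO question in 5b can be answered with simple arithmetic once a simulation has produced timestamps. This sketch assumes that recovery time is measured from the moment of disruption to the moment service is restored, and that the data loss window runs from the last recoverable point (for example the last good backup) to the moment of disruption. Your firm may define the start points differently. The function name `check_objectives` and all the figures are hypothetical.

```python
from datetime import datetime, timedelta


def check_objectives(disrupted_at, restored_at, last_recoverable_point, rto, rpo):
    """Compare one simulated recovery against its RTO and RPO."""
    # Assumption: the recovery clock starts at the disruption itself.
    recovery_time = restored_at - disrupted_at
    # Assumption: anything after the last recoverable point is lost.
    data_loss_window = disrupted_at - last_recoverable_point
    return {
        "recovery_time": recovery_time,
        "data_loss_window": data_loss_window,
        "rto_met": recovery_time <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


# Hypothetical figures from one simulated recovery.
result = check_objectives(
    disrupted_at=datetime(2020, 12, 1, 9, 0),
    restored_at=datetime(2020, 12, 1, 12, 30),
    last_recoverable_point=datetime(2020, 12, 1, 8, 45),
    rto=timedelta(hours=4),
    rpo=timedelta(minutes=10),
)
print(result["rto_met"], result["rpo_met"])  # prints: True False
```

In this made-up example the recovery took three and a half hours against a four hour RTO, but fifteen minutes of data were lost against a ten minute RPO, so the back-up frequency would need another look.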
6. What parts of a DRP need testing ?
a. All parts – in the event of a disaster the choice may be between running the DRP and closing your business. If it was your job, how well would you check that the DRP lifeboat was seaworthy ?
b. Consider 3rd Party suppliers. Many firms use 3rd Parties to supply services on which the firm’s vital business services may depend. Here both the firm’s DRP and the 3rd Party DRPs must come into co-ordinated play.
c. Check out the contracts your firm has with its partners to see if there are any DRP implications. It may well be that the DRP is contracted out in part or whole to a specialist DR 3rd Party.
7. What is the difference between a DR test and a real DR ?
a. Test DR looks for failure. Failure here means an error detected in the recovery documentation. During any test these errors should be logged for later DRP update. It is important that even the experts follow the documentation and record any necessary deviations. In a real DR such experts may not be available.
b. Real DR looks for success. Get the recovery done as quickly and as cleanly as possible. Achieve the RTO and the RPO.
c. Both test and real DR must follow the documentation. Observers should be deployed to monitor that adherence and record any deviation taken.
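The deviation logging described in 7a and 7c needs nothing more elaborate than a consistent record of what the documentation said, what was actually done, and who observed it. The sketch below is one hypothetical shape for such a log. The class names, field names and sample entries are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Deviation:
    step: str        # the documented step that was not followed as written
    documented: str  # what the recovery documentation said to do
    actual: str      # what was actually done
    observer: str    # who recorded the deviation


@dataclass
class DeviationLog:
    deviations: List[Deviation] = field(default_factory=list)

    def record(self, step, documented, actual, observer):
        """Add one observed deviation to the log."""
        self.deviations.append(Deviation(step, documented, actual, observer))

    def steps_needing_update(self):
        """Return each affected step once, in the order first recorded."""
        seen = []
        for deviation in self.deviations:
            if deviation.step not in seen:
                seen.append(deviation.step)
        return seen


# Hypothetical entries from one DR test.
log = DeviationLog()
log.record("Restore database", "Use backup server A", "Used backup server B", "Observer 1")
log.record("Notify clients", "Call the listed number", "Number was out of date", "Observer 2")
log.record("Restore database", "Run the restore script", "Ran the commands by hand", "Observer 1")

print(log.steps_needing_update())
# prints ['Restore database', 'Notify clients']
```

The list of affected steps becomes the work list for the DRP update that follows the test.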
8. Does the nature of the Business govern DRP testing ?
a. Yes. If the business runs 24/7, like a Bank or a Hospital, then there is no opportunity to borrow any equipment, circuit or network used by Production. Any simulation has to remain isolated (this isolation is the safest option anyway).
b. There may be hard restrictions on using even old Production data for testing if that data is sensitive or personal. Some kind of redaction technique may be required.
c. If the business uses in-house development systems it may be possible to use one to host a DR simulation system out of normal business hours.
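One possible form of the redaction mentioned in 8b is to replace each sensitive field with a keyed hash before the data reaches the simulation. This is a sketch of that single technique, not a statement of what any regulation requires. The field names, the key and the function `redact_record` are hypothetical, and whether keyed hashing is acceptable for your data is a question for your own data protection people.

```python
import hashlib
import hmac

# Hypothetical list of fields treated as sensitive or personal.
SENSITIVE_FIELDS = {"name", "email", "account_number"}


def redact_record(record, secret_key, sensitive_fields=SENSITIVE_FIELDS):
    """Return a copy of `record` with sensitive values replaced by tokens.

    The same input value always gives the same token for a given key,
    so records can still be matched to each other during the test.
    """
    redacted = {}
    for field_name, value in record.items():
        if field_name in sensitive_fields:
            digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            redacted[field_name] = "redacted-" + digest.hexdigest()[:12]
        else:
            redacted[field_name] = value
    return redacted


# Hypothetical Production record and key. The key must not travel with the data.
record = {"name": "Jane Example", "email": "jane@example.com", "balance": 1250}
safe = redact_record(record, secret_key=b"keep-this-out-of-the-test-site")

print(safe["balance"])       # prints 1250
print(safe["name"][:9])      # prints redacted-
```

Non-sensitive fields pass through unchanged, which keeps the test data realistic enough for the business process group to verify results.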
9. Is there any extra bonus from having a working DRP ?
a. Streamlining – if a given business process does not need rapid, or even 2nd wave, recovery then maybe it is not needed at all. Such processes may be legacy systems no longer fit for purpose.
b. Less risk – if the firm has an active and well tested DRP then it is less of a risk as an enterprise. Perhaps then lower insurance premiums can be negotiated.
c. Sell the expertise – if the firm has progressed from a very basic BCM state to perhaps even full ISO 22301 compliance then that journey may be a sellable experience.
That’s all for now folks.
Hope you found this interesting. Either way, we live in testing times.
Comments welcome on [email protected]
Roger Jarvis MBCI, Fulham, London. December 2020