Disaster Recovery (DR)
What is it?
Disaster recovery (DR) is an organization's ability to respond to and recover from an event that negatively affects business operations.
The goal is to reduce downtime, data loss and operational disruptions while maintaining business continuity by restoring critical applications and infrastructure ideally within minutes after an outage.
What is a disaster?
Disasters are often thought of in terms of natural events, but they can also be caused by system or technical failures, human error or intentional attacks. What they have in common is that they are significant enough to disrupt or completely stop critical systems and business operations for a period of time.
How to plan it?
Once an organization has thoroughly reviewed its risk factors, recovery goals and technology environment, it can write a disaster recovery plan. The DR plan is the formal document that specifies these elements and outlines how the organization will respond when disruption or disaster occurs. The plan details recovery goals, including the recovery time objective (RTO) and recovery point objective (RPO), as well as the steps the organization will take to minimize the effects of the disaster. A DR plan typically includes the following:
Key personnel and DR team contact information.
A risk assessment and business impact analysis (BIA) to identify potential threats, vulnerabilities and negative effects on the business.
An updated IT inventory that includes details on hardware, software assets and essential cloud computing services, specifying their business-critical status and ownership, such as owned, leased or utilized as a service.
A plan outlining how backups will be carried out, along with an RPO that states the maximum acceptable data loss (and therefore how frequently backups must run) and an RTO that defines the maximum acceptable downtime after a disaster.
A step-by-step description of disaster response actions immediately following an incident.
A diagram of the entire network and recovery site.
Directions for how to get to the recovery site.
A list of software and systems that staff will use in the recovery.
Sample templates for a variety of technology recoveries, including technical documentation from vendors.
A communication plan that includes internal and external contacts.
A summary of insurance coverage.
Proposed actions for dealing with financial and legal issues.
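The RPO and RTO items in the plan can be checked against a backup schedule and the results of a recovery drill. The following is a minimal Python sketch, not taken from any DR standard or tool. The function names and the example figures (a 4-hour RPO, a 2-hour RTO, nightly backups, a 90-minute drill) are hypothetical and chosen only for illustration. It assumes the worst-case data loss equals the interval between backups.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Assumption: if a failure happens just before the next backup runs,
    everything written since the previous backup is lost."""
    return backup_interval


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """True if the backup schedule keeps worst-case data loss within the RPO."""
    return worst_case_data_loss(backup_interval) <= rpo


def meets_rto(measured_recovery_time: timedelta, rto: timedelta) -> bool:
    """True if the recovery time observed in a drill is within the RTO."""
    return measured_recovery_time <= rto


# Hypothetical targets and measurements, for illustration only.
rpo = timedelta(hours=4)
rto = timedelta(hours=2)
nightly_backups = timedelta(hours=24)
hourly_backups = timedelta(hours=1)
drill_recovery_time = timedelta(minutes=90)

print(meets_rpo(nightly_backups, rpo))      # False: up to 24 hours of data could be lost
print(meets_rpo(hourly_backups, rpo))       # True
print(meets_rto(drill_recovery_time, rto))  # True
```

In this example, nightly backups cannot satisfy a 4-hour RPO, so the plan would need either more frequent backups or a less strict RPO.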
Building a DR team
A DR team is entrusted with creating, documenting and carrying out processes and procedures for an organization's data recovery and business continuity in the event of a disaster or failure.
Identify the key stakeholders. Determine who within the organization should be involved in the disaster recovery planning process. A DR team typically includes cross-departmental employees and executives, such as the chief information officer, IT personnel, department heads, business continuity experts, impact assessment and recovery advisors and crisis management coordinators.
Define roles and responsibilities. Once the members of the DR team are determined, the next step is to assign them specific roles and responsibilities to ensure effective management of the recovery process. Common roles include team leaders, IT experts, business continuity experts, disaster recovery coordinators and department liaisons.
Assess expertise. If the organization lacks internal expertise, it can outsource or engage a service provider. These providers can offer external expertise to aid the team or deliver disaster recovery as a service (DRaaS).
Develop a recovery plan. The team should write a detailed disaster recovery plan that outlines procedures for responding to various types of disasters. This plan should include steps for data backup and recovery, system restoration, communication protocols and employee safety procedures.
Train team members. It's important to train team members on their responsibilities within the disaster recovery strategy. This could entail running frequent drills and simulations to evaluate the plan's efficacy and pinpoint areas in need of improvement.
Regularly revise the DR plan. The disaster recovery plan needs to be reviewed and updated regularly to reflect organizational changes and how they affect the recovery process.
Document the procedures. All procedures and protocols within the DR plan should be documented in a clear and accessible format. This ensures that team members can easily reference and follow the necessary steps during a crisis.
DR sites
An organization uses a DR site to recover and restore its data, technology infrastructure and operations when its primary data center is unavailable. DR sites can be internal, external or cloud-based.
An organization sets up and maintains an internal DR site. Organizations with large information requirements and aggressive RTOs are more likely to use an internal DR site, which is typically a second data center. When building an internal site, the business must consider hardware configuration, supporting equipment, power maintenance, heating and cooling of the site, layout design, location and staff.
An external disaster recovery site is owned and operated by a third-party provider. External sites can be hot, warm or cold.
Hot site. A hot site is a fully functional data center with hardware and software, personnel and customer data, which is typically staffed 24/7 and operationally ready in the event of a disaster.
Warm site. A warm site is an equipped data center that doesn't have customer data. An organization can install additional equipment and introduce customer data following a disaster.
Cold site. This type of site has infrastructure to support IT systems and data, but no technology until an organization activates DR plans and installs equipment. These sites are sometimes used to supplement hot and warm sites during a long-term disaster.
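The difference between hot, warm and cold sites comes down to what is already in place before a disaster and what still has to be done when the site is activated. The sketch below models that in Python under simplifying assumptions. The `SiteType` class, its two flags and the step wording are hypothetical names introduced for illustration, and a warm site's optional additional equipment is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteType:
    name: str
    equipment_installed: bool    # hardware and software already in place
    customer_data_present: bool  # customer data already at the site


# Flags follow the descriptions above.
HOT = SiteType("hot", equipment_installed=True, customer_data_present=True)
WARM = SiteType("warm", equipment_installed=True, customer_data_present=False)
COLD = SiteType("cold", equipment_installed=False, customer_data_present=False)


def activation_steps(site: SiteType) -> list[str]:
    """List the work remaining when the DR plan is activated at this site."""
    steps = []
    if not site.equipment_installed:
        steps.append("install equipment")
    if not site.customer_data_present:
        steps.append("introduce customer data")
    steps.append("switch operations to the DR site")
    return steps


for site in (HOT, WARM, COLD):
    print(site.name, activation_steps(site))
```

The fewer steps a site needs at activation, the shorter the recovery time it can support, which is why hot sites suit aggressive RTOs.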
A cloud-based disaster recovery site is another option, and it has the advantage of being scalable. An organization should consider site proximity, internal and external resources, operational risks, service-level agreements (SLAs) and cost when contracting with cloud providers to host its DR assets or when outsourcing additional services.
Tiers of DR
Tier 7. This is the most advanced level of disaster recovery capability. At this level, artificial intelligence and automation are likely to play a key part in the recovery process.
Tier 6. Disaster recovery capabilities at this tier are comparable to Tier 5's, but they often include even more sophisticated technology and techniques for rapid recovery and minimal data loss.
Tier 5. This tier often implies advanced disaster recovery capabilities beyond a hot site. These can include real-time data replication, automated failover and enhanced monitoring and administration tools.
Tier 4. This tier includes a hot site, which is a DR site that's fully functioning and ready to use. Hot sites replicate the primary data center's systems and operations in real time, enabling quick failover and minimal downtime. They provide the maximum availability and recovery speed, but they're also the most expensive alternative.
Tier 3. By electronically vaulting mission-critical data, Tier 3 options improve upon the capabilities of Tier 2. Electronic vaulting of data involves electronically transferring data to a backup site, in contrast to the traditional method of physically shipping backup tapes or disks. After a disaster, there's less chance of data loss or re-creation because the electronically vaulted data is usually more recent than data sent through conventional means.
Tier 2. This tier improves upon Tier 1 with the addition of a hot site, a disaster recovery location that has hardware and network infrastructure already set up to facilitate faster recovery times.
Tier 1. This level consists of cold sites that provide basic infrastructure but lack preinstalled systems. Businesses in this category have data backups, but recovery involves manual intervention and hardware configuration, which lengthens recovery times.
Tier 0. This tier denotes the lowest preparedness level and is usually associated with organizations that don't have disaster recovery or off-site data backups. Because recovery in this tier is entirely dependent on on-site technologies, recovery times can be unpredictable.
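As a rough illustration of how the tiers build on one another, the sketch below estimates a tier from a handful of yes/no capabilities. It is a simplification of the descriptions above, not an official classification. The `Capabilities` fields and the `estimate_tier` function are hypothetical, and Tiers 5 through 7 are grouped together because the descriptions above do not give a clear-cut test for telling them apart.

```python
from dataclasses import dataclass


@dataclass
class Capabilities:
    offsite_backups: bool = False       # data backups kept away from the primary site
    hot_site: bool = False              # hardware and network already set up
    electronic_vaulting: bool = False   # backups transferred electronically
    realtime_replication: bool = False  # systems replicated in real time
    automated_failover: bool = False    # failover without manual intervention


def estimate_tier(c: Capabilities) -> int:
    """Return the lowest tier whose requirement is not yet met.

    A return value of 5 means "Tier 5 or higher".
    """
    if not c.offsite_backups:
        return 0
    if not c.hot_site:
        return 1
    if not c.electronic_vaulting:
        return 2
    if not c.realtime_replication:
        return 3
    if not c.automated_failover:
        return 4
    return 5


# Hypothetical organization: off-site backups and a hot site, but tapes are
# still shipped physically rather than vaulted electronically.
print(estimate_tier(Capabilities(offsite_backups=True, hot_site=True)))  # 2
```

Each check depends on the ones before it, so an organization with real-time replication but no off-site backups would still be rated Tier 0 by this sketch.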
Importance
Disasters can inflict damage with varying levels of severity, depending on the scenario. A brief network outage could result in frustrated customers and some loss of business to an e-commerce system. A hurricane or tornado could destroy an entire manufacturing facility, data center or office.
Also, the shift to public, private, hybrid and multi-cloud systems and the rise of remote workforces are making IT infrastructures more complex and potentially risky. An effective disaster recovery plan lets organizations respond promptly to disruptive events, offering the following benefits in return:
Cost reduction.
Data loss reduction.
Business continuity.
Compliance.
Emergency preparedness.