In simple words - AWS Architecture for Disaster Recovery Strategies
Aviv Graupen
AI/ML technical specialist | Data center architect | Technical trainer | MBA, BSC
In simple words - Disaster Recovery - the process of preparing for and recovering from a disaster.
In simple words - RPO (Recovery Point Objective) - how much data can you afford to lose or recreate? It is measured in terms of data loss.
In simple words - RTO (Recovery Time Objective) - how quickly must you recover, and what does the downtime cost you? It is measured in terms of downtime.
In simple words - Disaster Recovery is a part of your resiliency strategy, yet it can be compared to availability, which is also a part of your resiliency strategy - so what is the difference?
Disaster Recovery measures its objectives (RPO, RTO) for a one-time event, while availability is measured over a period of time.
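To make the difference concrete, here is a minimal Python sketch. The function names, timestamps and targets are made up for illustration (they are not from any AWS tool): the first two functions look at a single event, the way RPO and RTO do, while the last one looks at a whole period, the way availability does.

```python
from datetime import datetime, timedelta

def measure_event(last_recovery_point, outage_start, service_restored):
    """Return (data_loss_window, downtime) for ONE disaster event."""
    data_loss = outage_start - last_recovery_point   # compared against the RPO
    downtime = service_restored - outage_start       # compared against the RTO
    return data_loss, downtime

def meets_objectives(data_loss, downtime, rpo, rto):
    """True only if this single event stayed within both objectives."""
    return data_loss <= rpo and downtime <= rto

def availability(period, total_downtime):
    """Fraction of a whole period during which the service was up."""
    return 1 - total_downtime / period

# Hypothetical event: last backup at 02:00, outage at 09:30, recovered at 11:00.
last_backup = datetime(2024, 1, 1, 2, 0)
outage = datetime(2024, 1, 1, 9, 30)
restored = datetime(2024, 1, 1, 11, 0)

data_loss, downtime = measure_event(last_backup, outage, restored)
print(data_loss, downtime)  # 7:30:00 1:30:00
print(meets_objectives(data_loss, downtime,
                       rpo=timedelta(hours=8), rto=timedelta(hours=1)))  # False
print(availability(timedelta(days=30), downtime))
```

In this made-up event the RPO of 8 hours is met (7.5 hours of data lost), but the RTO of 1 hour is missed (1.5 hours of downtime), even though the availability over the whole month still looks high.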
In simple words - in the cloud, for Disaster Recovery we should think about using a Multi-AZ strategy (for disaster events that stay within a Region, such as the loss of a data center) or a Multi-Region strategy (for regional disasters, with the workload spread across distinct geographic locations). The Disaster Recovery strategies are split into 2 kinds: active/passive and active/active.
In simple words - there are 4 common Disaster Recovery strategies:
Backup & restore (active/passive):
The active left side is the production Region, while the right side is your DR Region.
This isn't live: backups are copied to the DR Region (cross-region copy), and whenever a DR event happens the workload is restored there from that copy.
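As a rough illustration of why this approach loses the most data, here is a small Python sketch. The backup interval and copy lag are assumed numbers, and the function name is hypothetical. The idea: if the disaster hits just before the newest backup finishes copying to the DR Region, the only usable copy is the previous one.

```python
from datetime import timedelta

def worst_case_data_loss(backup_interval, copy_lag):
    """Worst-case age of the newest backup copy available in the DR Region.

    Assumes backups are taken every `backup_interval` and each one needs
    `copy_lag` to land in the DR Region (both values are assumptions).
    """
    return backup_interval + copy_lag

# Hypothetical schedule: a backup every 24 hours, 2 hours to copy cross-region.
loss = worst_case_data_loss(timedelta(hours=24), timedelta(hours=2))
print(loss)  # 1 day, 2:00:00
```

So with a daily backup, the RPO you can honestly promise is more than a day, not "a day".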
Pilot Light (active/passive):
The active left side is the production Region, while the right side is your DR Region, yet this time, using Aurora, there is asynchronous (not real-time) cross-region replication between the Regions. Also, Route 53 (using health checks) can remove the active route and start sending traffic to the other route; once you start getting traffic, you add the compute instances through an Auto Scaling group (ASG).
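The failover flow above can be sketched as a tiny simulation in plain Python. This is not the Route 53 or Auto Scaling API; the function names, the failure threshold of 3 and the capacity of 4 instances are assumptions chosen only to show the order of events: the health checks fail, the route moves to the DR Region, and only then is compute added.

```python
def should_fail_over(health_checks, failure_threshold=3):
    """True once the most recent `failure_threshold` checks have all failed."""
    if len(health_checks) < failure_threshold:
        return False
    return not any(health_checks[-failure_threshold:])

def pilot_light_failover(health_checks, desired_capacity, failure_threshold=3):
    """Return the routing/compute state after evaluating the health checks."""
    # Pilot Light: the data is replicated, but no compute runs in DR yet.
    state = {"active_route": "primary", "dr_instances": 0}
    if should_fail_over(health_checks, failure_threshold):
        state["active_route"] = "dr"              # traffic moves to the DR route
        state["dr_instances"] = desired_capacity  # the ASG scales out from zero
    return state

# One failed check is not enough to fail over...
print(pilot_light_failover([True, True, False], desired_capacity=4))
# {'active_route': 'primary', 'dr_instances': 0}

# ...three failed checks in a row are.
print(pilot_light_failover([True, False, False, False], desired_capacity=4))
# {'active_route': 'dr', 'dr_instances': 4}
```

The time it takes to go from 0 to the desired capacity is part of your RTO, which is exactly the gap that the next strategy reduces.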
Warm standby (active/passive):
Very similar to Pilot Light, yet in the DR site there are already minimal compute resources up and running. This gives a faster recovery time than Pilot Light, since resources at the DR site are already running.
Multi-site (active/active):
Both routes are active - Route 53 decides where to send traffic based on latency, geolocation, etc.
DynamoDB (global tables) is used here to keep the data in sync between the Regions through cross-region replication.
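A minimal Python sketch of the latency-based decision, assuming we already know the measured latency from a user to each Region and whether each Region is healthy. The function name and the latency numbers are hypothetical, and this only imitates the idea; it does not call Route 53.

```python
def pick_region(latencies_ms, healthy):
    """Return the healthy Region with the lowest latency for this user."""
    candidates = {region: ms for region, ms in latencies_ms.items()
                  if healthy.get(region, False)}
    if not candidates:
        raise RuntimeError("no healthy Region available")
    return min(candidates, key=candidates.get)

# Hypothetical latencies measured from one user's location.
latencies = {"eu-west-1": 35, "us-east-1": 110}

# Both Regions are active, so the closest one serves the user.
print(pick_region(latencies, {"eu-west-1": True, "us-east-1": True}))   # eu-west-1

# If that Region fails, traffic simply goes to the other active Region.
print(pick_region(latencies, {"eu-west-1": False, "us-east-1": True}))  # us-east-1
```

Since both Regions already serve traffic, there is nothing to "start" during a disaster, and that is why this strategy recovers the fastest.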
In simple words to conclude - for Disaster Recovery we need to weigh cost and complexity against data loss and service interruption.
Looking at the graph below:
The "Backup & restore" approach is the cheapest way to go, yet its data loss will be the largest, while the "Multi-site active/active" approach is a real-time solution, yet it is the most expensive one.
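The trade-off can be sketched as "pick the cheapest strategy that still meets your RPO and RTO". In the Python sketch below, only the cost order of the four strategies comes from this article; the RPO/RTO numbers attached to each strategy are illustrative assumptions, not AWS figures, and should be replaced with what your own design can really achieve.

```python
from datetime import timedelta
from typing import NamedTuple, Optional

class Strategy(NamedTuple):
    name: str
    cost_rank: int    # 1 = cheapest, 4 = most expensive
    rpo: timedelta    # assumed achievable data-loss window
    rto: timedelta    # assumed achievable recovery time

# Illustrative numbers only (assumptions for this example).
STRATEGIES = [
    Strategy("backup & restore", 1, timedelta(hours=24), timedelta(hours=24)),
    Strategy("pilot light", 2, timedelta(minutes=15), timedelta(hours=1)),
    Strategy("warm standby", 3, timedelta(minutes=5), timedelta(minutes=15)),
    Strategy("multi-site active/active", 4, timedelta(seconds=1), timedelta(seconds=60)),
]

def cheapest_strategy(required_rpo, required_rto) -> Optional[Strategy]:
    """Cheapest strategy whose assumed RPO and RTO both meet the requirement."""
    ok = [s for s in STRATEGIES
          if s.rpo <= required_rpo and s.rto <= required_rto]
    return min(ok, key=lambda s: s.cost_rank) if ok else None

# A relaxed requirement is satisfied by the cheapest option.
print(cheapest_strategy(timedelta(hours=48), timedelta(hours=48)).name)
# backup & restore

# A tighter requirement pushes you up the cost ladder.
print(cheapest_strategy(timedelta(minutes=10), timedelta(minutes=30)).name)
# warm standby
```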
Disaster Recovery events (cyber attacks, hardware issues, networking issues, floods, fires, etc.) are possible events for your business units (BUs), therefore the right questions need to be asked in order to decide which RPO and RTO should be defined.