What’s the Difference Between Fault Tolerance and High-Availability

Fault tolerance and high-availability are two terms that are often used interchangeably in IT circles. There are, however, several important distinctions between a fault-tolerant system and a high-availability system. If you are considering moving to one of these two approaches, it’s important to understand the advantages that each one offers.

What is Fault Tolerance?

Like high-availability, fault tolerance is designed to minimise downtime. However, the methods used to minimise downtime in a fault-tolerant system differ from those used by a high-availability system. A fault-tolerant system is designed to continue operating, without interruption, even if one of its components goes down.

There are several different methods of fault tolerance that you will want to be aware of. These methods include:

Triple Modular Redundancy 

In a triple modular redundancy fault-tolerant system, redundancy is achieved by having three different systems set up to perform the same process. The results that these systems produce are then checked by a majority voting system, which then produces a single output. In the event that one of the three systems fails, a correct output can still be generated since the other two systems will still provide a correct output to the majority voting system.
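
The voting step can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the three "systems" are plain functions, one of them is deliberately faulty, and the names `majority_vote`, `run_tmr`, `healthy_square` and `faulty_square` are hypothetical, not part of any real product.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(results: Sequence[int]) -> int:
    """Return the value reported by a strict majority of the modules."""
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        # No value was reported by more than half of the modules.
        raise RuntimeError("no majority: modules disagree")
    return value


def run_tmr(modules: Sequence[Callable[[int], int]], x: int) -> int:
    """Run every module on the same input and vote on the outputs."""
    return majority_vote([module(x) for module in modules])


def healthy_square(x: int) -> int:
    return x * x


def faulty_square(x: int) -> int:
    # Simulated fault: this module always returns a wrong answer.
    return x * x + 1


result = run_tmr([healthy_square, faulty_square, healthy_square], 7)
print(result)  # 49, because two of the three modules agree
```

Note that the voter itself is a single point of failure in this sketch; real designs often replicate or harden the voter as well.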

Forward Error Correction

Forward error correction involves adding redundancy directly to the message that a system sends out rather than adding redundancy to the system itself. Because the redundancy travels with the message, the receiver is able to verify the data and correct certain errors caused by unstable or noisy channels, without asking the sender to retransmit.
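
As a rough illustration, the sketch below uses a 3x repetition code, one of the simplest forward error correction schemes: every bit is sent three times and the receiver takes a majority vote per group. Production systems generally use far more efficient codes (Hamming or Reed-Solomon codes, for example); the function names `encode` and `decode` are hypothetical.

```python
from typing import List


def encode(bits: List[int], repeat: int = 3) -> List[int]:
    """Repeat every bit `repeat` times before sending."""
    return [b for b in bits for _ in range(repeat)]


def decode(received: List[int], repeat: int = 3) -> List[int]:
    """Recover each bit by majority vote over its group of copies."""
    decoded = []
    for i in range(0, len(received), repeat):
        group = received[i:i + repeat]
        decoded.append(1 if sum(group) * 2 > len(group) else 0)
    return decoded


message = [1, 0, 1, 1]
sent = encode(message)

# Simulate a noisy channel that flips one bit in each of the first two groups.
noisy = sent.copy()
noisy[1] ^= 1
noisy[5] ^= 1

recovered = decode(noisy)
print(recovered)  # [1, 0, 1, 1]
```

This code corrects one flipped bit per group of three; two flips in the same group would be decoded incorrectly, which is why the text above says only "certain errors" can be corrected.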

Checkpointing

Checkpointing is one of the most common methods of fault tolerance and is used regularly in everyday applications such as word processors. This method involves automatically saving data periodically so that the system can be restored to its saved state in the event of a crash. While checkpointing may seem simple enough, it can become a complicated process when you are capturing the state of an entire distributed system. However, there are a number of solutions, such as Distributed MultiThreaded Checkpointing (DMTCP), that simplify the process and allow you to checkpoint the state of computations spread across multiple machines.
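
For a single process, the idea can be sketched as follows. This is a minimal example under stated assumptions, not how DMTCP or any particular tool works: the state is a small dictionary saved as JSON, the file name `checkpoint.json` and the helper names are hypothetical, and the crash is simulated with an exception.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state to a temporary file, then atomically replace the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the last saved state, or `default` if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def sum_with_checkpoints(numbers, path, every=2, crash_at=None):
    """Sum `numbers`, saving progress every `every` items and resuming if possible."""
    state = load_checkpoint(path, {"index": 0, "total": 0})
    for i in range(state["index"], len(numbers)):
        if crash_at is not None and i == crash_at:
            raise RuntimeError("simulated crash")
        state["total"] += numbers[i]
        state["index"] = i + 1
        if state["index"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]


with tempfile.TemporaryDirectory() as workdir:
    checkpoint_path = os.path.join(workdir, "checkpoint.json")
    numbers = [1, 2, 3, 4, 5]
    try:
        sum_with_checkpoints(numbers, checkpoint_path, crash_at=3)
    except RuntimeError:
        pass  # the first run crashed; progress up to index 2 is on disk
    total = sum_with_checkpoints(numbers, checkpoint_path)

print(total)  # 15, resumed from the checkpoint rather than from scratch
```

Writing to a temporary file and swapping it in with `os.replace` means a crash during the save itself leaves the previous checkpoint intact.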

Byzantine Fault-Tolerance

Byzantine fault-tolerance addresses the hardest class of failures: components that do not simply stop, but keep running and produce incorrect or conflicting outputs. It is designed to deal with situations where the modules that monitor or vote on results cannot easily reach a consensus on what a given output should be. Rather than being a single technique, it typically builds on ideas from the methods above, such as redundancy and voting, and adds agreement protocols between replicas in order to address this problem. For now, though, suffice it to say that Byzantine fault-tolerance is among the most comprehensive approaches available when building a fault-tolerant system, and it generally requires more replicas and coordination than the other methods.
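
To give a sense of the cost, a commonly cited bound for classic Byzantine agreement protocols (without relying on message signatures) is that tolerating f faulty components requires at least 3f + 1 replicas. The short sketch below only does that arithmetic; the function names are hypothetical and specific protocols may have different requirements.

```python
def min_replicas(faults: int) -> int:
    """Smallest number of replicas needed to tolerate `faults` Byzantine components (3f + 1)."""
    return 3 * faults + 1


def max_byzantine_faults(replicas: int) -> int:
    """Largest number of Byzantine components that `replicas` replicas can tolerate."""
    return max((replicas - 1) // 3, 0)


print(min_replicas(1))          # 4
print(max_byzantine_faults(3))  # 0: three replicas cannot mask a Byzantine fault
print(max_byzantine_faults(7))  # 2
```

This is why three replicas are enough for triple modular redundancy against simple wrong outputs seen by a trusted voter, while four are needed once replicas must agree among themselves in the presence of a misbehaving peer.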

What’s the Difference Between Fault Tolerance and High-Availability?

While high-availability systems and fault-tolerant systems are both designed to accomplish essentially the same objective, there are a number of important distinctions between the two approaches. One key difference is that high-availability systems are designed both to limit downtime and to keep the performance of the system from being negatively affected. With a fault-tolerant system, downtime is still limited, but maintaining performance isn’t as much of a priority.

While this makes it sound as if high-availability systems have a clear advantage, there is an important benefit to fault tolerance that must be taken into account as well. If an error occurs while an action is in progress in a fault-tolerant system, the correct end state of that action will still be produced. This is not the case with a high-availability system.

For example, if a user submits a request to your website hosted on a high-availability platform and the node handling it crashes, the user will receive a 500 error. However, the system will remain operational and will be able to respond to new requests. With a fault-tolerant system, though, the failure is worked around and a valid response is still returned to the user, though it might be delayed. This is the most important distinction between high-availability and fault tolerance to keep in mind when deciding which system is best for your organisation.
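
The difference in behaviour can be illustrated with a toy model. This is a deliberately simplified sketch, not a description of how any real platform implements either approach: the `Node` class, the function names and the response strings are all hypothetical, and the fault-tolerant path is modelled as a transparent retry on another replica.

```python
class NodeCrashed(Exception):
    """Raised when a request reaches a node that has failed."""


class Node:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        if not self.healthy:
            raise NodeCrashed(self.name)
        return f"200 OK from {self.name}: {request}"


def high_availability(nodes, request):
    """The in-flight request fails, but the crashed node is dropped for later requests."""
    node = nodes[0]
    try:
        return node.handle(request)
    except NodeCrashed:
        nodes.remove(node)  # failover: new requests go to surviving nodes
        return "500 Internal Server Error"


def fault_tolerant(nodes, request):
    """The failure is worked around: the same request is served by another replica."""
    for node in list(nodes):
        try:
            return node.handle(request)
        except NodeCrashed:
            nodes.remove(node)
    return "500 Internal Server Error"  # every replica has failed


ha_pool = [Node("a", healthy=False), Node("b")]
print(high_availability(ha_pool, "GET /"))  # 500 Internal Server Error
print(high_availability(ha_pool, "GET /"))  # 200 OK from b: GET /

ft_pool = [Node("a", healthy=False), Node("b")]
print(fault_tolerant(ft_pool, "GET /"))     # 200 OK from b: GET /
```

In the high-availability case only the request that was in flight during the crash is lost; in the fault-tolerant case the user still gets a valid response, at the cost of the extra attempt.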

Conclusion

Both high-availability systems and fault-tolerant systems excel at preventing downtime and ensuring that single failures don’t crash the entire system. In the end, whether high-availability or fault tolerance is the right choice for your organisation comes down to your specific priorities and requirements.

If you would like to learn more about creating either a fault-tolerant or high-availability system, we invite you to contact us today. At Servers Australia, we are dedicated to helping organisations of all sizes eliminate downtime through effective technological solutions such as fault tolerance and high-availability, and we would be happy to work with you to develop the right approach for your organisation. Servers Australia is an Enterprise Partner with VMware, which has allowed us to deliver industry-leading Fault Tolerance and High Availability solutions throughout our Data Centres.
