Highly available services: The non-techies introduction
Víctor Román Archidona
CEO & Privacy-Focused SRE | Building Human-Centric Cloud Infrastructure at LoadFront | Remote-First Technical Leader
The magic behind always-on
You have landed on a series of articles explaining what high availability of software services is and how to achieve it.
In them, I explain how to build and deploy such services so that they keep working most of the time.
This entry is a necessary, non-technical introduction. Expect the next ones to be written for technical audiences using UNIX-like operating systems, such as Linux and FreeBSD.
A (computer) services world
We cannot conceive of today's computing without so-called services.
Services are a mix of software (computer applications) and hardware permanently running somewhere, connected to a network, and whose mission is to transform and transmit information.
If any of the elements in the mix do not work, then the service could be useless, and it would not receive or transform the information it manages.
A service can be simple, such as a single program on a single server (or home computer), or very complex, made up of many computer programs.
Services can provide countless different functions. Some display web pages like this one; others send an SMS to a cell phone after something happens; others take an input, such as an amount in US dollars, and convert it to its equivalent in euros. The possibilities are endless. Services can even use other services.
Top-used sites like Google or Facebook run many applications on thousands of servers and other gear to serve their search, maps, photography, social network, and related services they offer.
When we talk about a service as a whole, we refer to everything that composes it: software, hardware, networks, and more.
What is availability?
Before we can answer what high availability is, we must know what availability is.
This term means that a service is accessible and functional. A typical service is a web server, where there are pages stored and accessed from web browsers.
When users try to visit an existing website through a web browser and it does not load, they experience an "unavailable service". Such a lack of availability tends to make users stop using that service.
A service may use other services: A web page may use a database to fetch and display some content, and that database must be available for operation.
Therefore, "availability" is what allows an existing service to be accessed and used without inconvenience.
A service that works as expected most of the time is reliable. Reliability is something we are going to talk about in other posts.
Availability time
A standard metric for high availability is a measure of how long the service has been available over a specific period, and is habitually represented as a percentage.
This percentage is called "availability time" or "(service) uptime". The time the service has not been available is called "downtime".
To calculate this time and percentage, we need to decide how long the period we are going to measure is, commonly expressed in minutes.
A day has 1440 minutes, a week 10080 minutes, a 30-day month 43200 minutes, and a 365-day year 525600 minutes. The figures for months and years vary with the number of days they have.
Then a simple formula is applied to calculate such availability:
(total period minutes - unavailable minutes) / total period minutes * 100
The result is the availability percentage in the chosen period. If we measure 60 minutes of downtime over a day (1440 minutes), we have:
(1440 - 60) / 1440 * 100 = 95.83 % of availability that day
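As a minimal sketch, the formula above can be written in a few lines of Python. The function name `availability_percent` and its parameter names are made up for this illustration; they do not come from any library.

```python
def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Return the availability of a service as a percentage of the measured period."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime_minutes must be between 0 and total_minutes")
    # (total period minutes - unavailable minutes) / total period minutes * 100
    return (total_minutes - downtime_minutes) / total_minutes * 100


MINUTES_PER_DAY = 24 * 60  # 1440 minutes in a day

# 60 minutes of downtime over one day, as in the example above
print(f"{availability_percent(MINUTES_PER_DAY, 60):.2f} %")
```

Running it for the one-day example prints the same 95.83 % worked out by hand. Changing the first argument to the minutes in a week, month, or year gives the availability over that longer period.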
What is high availability?
We can define high availability as a process to achieve the least unavailable service time, even if some of the underlying components are failing.
The way to achieve high availability is to run services on a carefully designed, fully redundant IT infrastructure.
It is not only hardware such as network gear, servers, and their components. The following is a non-exhaustive list of elements:
There are many more behind the scenes. Top services are also replicated across data centres in different regions of the world, which makes them more available, among other benefits.
Not all services require the same level of high availability: a critical service, like user login on a popular website, needs to be much more available than the login on a small personal page.
Note that a service cannot be more available than the components it depends on. If the previous service relies on a database that is available 50% of the time, the service cannot be available more than 50% of the time.
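To put rough numbers on this, here is a small Python sketch. It rests on two simplifying assumptions that are not stated in the text: the components fail independently of each other, and the service needs all of them at the same time. Under those assumptions, the combined availability is the product of the individual availabilities. The function name `combined_availability` and the 99.9 % figure for the web server are hypothetical, chosen only for illustration.

```python
from math import prod


def combined_availability(*percentages: float) -> float:
    """Availability (in %) of a service that needs every listed component.

    Assumes the components fail independently of each other.
    """
    # Convert each percentage to a fraction, multiply them, convert back.
    return prod(p / 100 for p in percentages) * 100


# Hypothetical web server available 99.9 % of the time,
# relying on the database available 50 % of the time from the text.
print(f"{combined_availability(99.9, 50):.2f} %")
```

The result is slightly below 50 %, so the combined service ends up a little less available than its weakest component, never more.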
Why high availability?
We need high availability to ensure the services are performing most of the time.
From a user's point of view, high availability means we can use the service at any time. For companies, offering highly available services makes them trustworthy.
Moving forward
Now that we understand what high availability is, be prepared to achieve it. The next articles focus on ways to use and configure "floating" (virtual) IP addresses, load balancers like HAProxy, redundancy for standard services, and technologies like containers and container orchestrators, among others.