Reliability of a Data-Intensive Application

During the last decade, we have seen various technological developments that have enabled companies to build platforms, such as social networks and search engines, that generate and manage unprecedented volumes of data. These massive amounts of data have made it imperative for businesses to focus on agility and short development cycles, along with hypothesis testing, to allow a quick response to emerging market trends and insights.

Each piece of software is unique and must be treated as such, but it is also true that there are foundations shared among most software systems. The foundations can be reduced to Reliability, Scalability, and Maintainability.

In a data-intensive application, datasets are divided into smaller fragments and distributed across different geographical locations. For this kind of application to thrive, it must serve massive numbers of users continuously, while still allowing the application to be improved and, in an emergency, fixed quickly.



A data-intensive application is typically built from standard building blocks that provide commonly needed functionality. For example, many applications need to:

  • Store data so that they, or another application, can find it again later (databases)
  • Remember the result of an expensive operation, to speed up reads (caches; a small cache-aside sketch follows this list)
  • Allow users to search data by keyword or filter it in various ways (search indexes)
  • Send a message to another process, to be handled asynchronously (stream processing)
  • Periodically crunch a large amount of accumulated data (batch processing)
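
To make the cache building block concrete, here is a minimal cache-aside sketch in Python. The database dictionary, the expensive_query function, and the half-second delay are hypothetical stand-ins for a real data store and a slow query, not part of any particular product.

import time

# Hypothetical stand-ins for a real database and an expensive query.
database = {"user:1": {"name": "Ada"}, "user:2": {"name": "Grace"}}
cache = {}

def expensive_query(key):
    time.sleep(0.5)          # simulate a slow trip to storage
    return database.get(key)

def get(key):
    # Cache-aside: return the cached value if present,
    # otherwise read from the database and remember the result.
    if key in cache:
        return cache[key]
    value = expensive_query(key)
    cache[key] = value
    return value

print(get("user:1"))  # slow: goes to the database
print(get("user:1"))  # fast: served from the cache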


Let's take a deeper look at what Reliability means:


Reliability simply means “continuing to work correctly, even when things go wrong.”

Undoubtedly there will be times when our webpage, application, or software system will fail. Even the most experienced programmer is prone to errors, as it is human nature to be imperfect. Other sources of fault can be hardware or even software. Regardless of the failure source, a system should continue to perform at the desired level even in adversity.

Faults and failures are two different things: a fault is when a component of the system deviates from its spec or expectation, whereas a failure is when a system component is completely down and has stopped working.

  1. Hardware faults: e.g. a disk fails (mitigated with hardware redundancy) or a cloud instance goes down (mitigated with software-level fault tolerance, such as running multiple instances of an application and exchanging heartbeats; a minimal heartbeat sketch follows this list).
  2. Software errors: systematic errors that should be addressed with testing, monitoring, and constant self-checking.
  3. Human errors: human operation can be unreliable, so the system needs to provide clear interfaces, sandboxes, and easy recovery from human errors.
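
As a rough illustration of the heartbeat idea from point 1, the sketch below has each instance report a timestamp, and a monitor treats an instance as failed if it has not reported within a timeout. The instance names and the five-second timeout are illustrative assumptions, not values from any specific platform.

import time

HEARTBEAT_TIMEOUT = 5.0   # assumed: seconds of silence before an instance is considered down
last_heartbeat = {}       # instance id -> time of its last heartbeat

def record_heartbeat(instance_id):
    # Called by each running instance on a regular interval.
    last_heartbeat[instance_id] = time.monotonic()

def failed_instances():
    # Instances whose last heartbeat is older than the timeout are presumed failed.
    now = time.monotonic()
    return [i for i, t in last_heartbeat.items() if now - t > HEARTBEAT_TIMEOUT]

record_heartbeat("app-1")
record_heartbeat("app-2")
time.sleep(0.1)
print(failed_instances())  # [] while both instances keep reporting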


There are situations in which we may choose to sacrifice reliability in order to reduce development cost (e.g., when developing a prototype product for an unproven market) or operational cost (e.g., for a service with a very narrow profit margin) — but we should be very conscious of when we are cutting corners.


Fault Tolerance:

Fault tolerance is the set of techniques used to keep interconnected systems working together and to sustain reliability and availability in distributed systems. Hardware and software redundancy are the best-known fault-tolerance techniques in distributed systems.

Fault-Tolerance Mechanisms in Distributed Systems:

Replication-based fault tolerance is one of the most popular techniques. It replicates the data onto several other systems, so a request can be served by any one replica among the others. In this way, if one or more nodes fail to function, the whole system does not stop functioning. Replication adds redundancy to a system.
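
A minimal sketch of this idea, assuming three hypothetical replicas that each hold a full copy of the data: the read falls back to the next replica when one is down, so a single failed node does not take the whole system down with it.

import random

class ReplicaDown(Exception):
    pass

# Hypothetical replicas, each holding a full copy of the data.
replicas = {
    "replica-a": {"order:42": "shipped"},
    "replica-b": {"order:42": "shipped"},
    "replica-c": {"order:42": "shipped"},
}

def read_from(replica_id, key):
    # Simulate an occasional replica failure.
    if random.random() < 0.3:
        raise ReplicaDown(replica_id)
    return replicas[replica_id][key]

def read(key):
    # Try each replica in turn; a single failed node does not
    # prevent the request from being answered.
    for replica_id in replicas:
        try:
            return read_from(replica_id, key)
        except ReplicaDown:
            continue
    raise RuntimeError("all replicas failed")

print(read("order:42"))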

Major issues with this technique:

Consistency: This is a vital issue in replication. Several copies of the same entity create a consistency problem, because any user may update any copy. Data consistency is ensured by criteria such as linearizability, sequential consistency, and causal consistency. Linearizability and sequential consistency are strong consistency criteria, whereas causal consistency defines a weaker criterion. For example, the primary-backup replication technique guarantees consistency through linearizability, as does active replication.
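
As a toy illustration of the primary-backup style mentioned above, and not a production protocol, the sketch below routes every write through a single primary, which forwards it to all backups before acknowledging, so every copy applies updates in one agreed order.

class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

class Primary(Replica):
    def __init__(self, name, backups):
        super().__init__(name)
        self.backups = backups

    def write(self, key, value):
        # The primary orders all writes and forwards each one to every
        # backup before acknowledging, so every copy applies the same
        # sequence of updates.
        self.apply(key, value)
        for backup in self.backups:
            backup.apply(key, value)
        return "ack"

backups = [Replica("backup-1"), Replica("backup-2")]
primary = Primary("primary", backups)
primary.write("balance:alice", 100)
print([r.data["balance:alice"] for r in backups])  # [100, 100]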

Degree (number) of replicas: Replication techniques use protocols such as primary-backup replication, voting, and primary-per-partition replication. To attain a high level of consistency, a large number of replicas is needed; if the number of replicas is too low, it affects scalability, performance, and the ability to tolerate multiple faults. To address the problem of having too few replicas, an adaptive replica creation algorithm was proposed.
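
The voting protocol mentioned above is commonly expressed as a quorum condition: with N replicas, a write must be acknowledged by W of them and a read must consult R of them, and choosing R + W > N ensures that every read set overlaps the latest successful write set. A small sketch of that check, with illustrative numbers:

def quorums_overlap(n, w, r):
    # With n replicas, w write acknowledgements and r read responses,
    # every read set intersects every write set when r + w > n.
    return r + w > n

n = 5
for w, r in [(3, 3), (2, 2), (5, 1)]:
    print(f"N={n}, W={w}, R={r} -> overlap: {quorums_overlap(n, w, r)}")
# N=5, W=3, R=3 -> overlap: True
# N=5, W=2, R=2 -> overlap: False
# N=5, W=5, R=1 -> overlap: True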

The main aim of ARC (the adaptive replica creation algorithm) is to maintain a rational number of replicas: satisfying the availability users expect, improving access efficiency, and balancing load, while also reducing bandwidth requirements, maintaining the system's stability, and providing users with a satisfactory QoS.
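
As a toy illustration only, and not the published ARC algorithm itself, the sketch below adjusts a replica count between assumed minimum and maximum bounds based on observed request rate, capturing the trade-off between availability and access efficiency on one side and bandwidth and storage cost on the other. The bounds and per-replica capacity are made-up numbers.

MIN_REPLICAS = 2      # assumed lower bound for fault tolerance
MAX_REPLICAS = 8      # assumed upper bound to limit bandwidth and storage cost

def target_replicas(requests_per_second, capacity_per_replica=100):
    # Add replicas when a piece of data is under heavy read load,
    # remove them when demand drops, staying within the bounds above.
    needed = -(-requests_per_second // capacity_per_replica)   # ceiling division
    return max(MIN_REPLICAS, min(MAX_REPLICAS, needed))

for load in (50, 350, 2000):
    print(load, "req/s ->", target_replicas(load), "replicas")
# 50 req/s -> 2 replicas
# 350 req/s -> 4 replicas
# 2000 req/s -> 8 replicas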

[Image: a simple fault-tolerant setup in GCP]






