Reliability: Non-Stop Operation is the Ancelus Goal.
[Image: Table join test system, billion-record tables]


Uptime is important. In operational systems it can be critical. Eliminating the root causes of downtime requires a passionate focus on the mundane: what happens in production, not in the demo. Most DBMS vendors would rather not talk about it. Ancelus developers fixate on it. The hardest part is containing unplanned downtime.

First, we need to recognize that all systems crash. We’ve made the core elements of Ancelus as bulletproof as possible through years of testing. But that isn’t always enough. Sometimes it’s a hardware failure. Or it might be a power spike that gets past the UPS, or a problem with the application code. Or maybe a cosmic ray flips a bit (the explanation when we can’t find a cause). But with Ancelus it doesn’t need to be a 5-alarm crisis. Once it happens, the immediate priority is to return to service with up-to-the-event data.

Several modes of system failure can be repaired in Ancelus with no downtime at all. If an index is corrupted, the Fix utility detects and repairs the indexes while the database is live – no downtime needed.
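Ancelus doesn’t publish the internals of its Fix utility, but the general idea of live index repair can be sketched: walk the index, verify each entry against the base table, and overwrite or re-insert only the bad entries, so readers keep working throughout. The data layout and function names below are illustrative assumptions, not Ancelus code.

```python
# Hypothetical sketch of online index repair: verify each index entry
# against the base table and fix only the entries that are wrong.
# The dict-based layout is an illustrative assumption, not Ancelus internals.

def build_index(records):
    """Map key -> row position: the 'healthy' state of the index."""
    return {rec["key"]: pos for pos, rec in enumerate(records)}

def fix_index(index, records):
    """Repair a possibly corrupted index in place, one entry at a time,
    so concurrent readers only ever see a valid or freshly repaired slot."""
    repaired = 0
    expected = build_index(records)
    for key in list(index):
        if key not in expected:
            del index[key]              # spurious entry: drop it
            repaired += 1
        elif index[key] != expected[key]:
            index[key] = expected[key]  # entry points at the wrong row
            repaired += 1
    for key, pos in expected.items():
        if key not in index:
            index[key] = pos            # entry was lost entirely
            repaired += 1
    return repaired

records = [{"key": "a"}, {"key": "b"}, {"key": "c"}]
index = build_index(records)
index["b"] = 99          # simulate corruption: entry points at a bogus row
del index["c"]           # simulate corruption: entry lost
fixed = fix_index(index, records)
print(fixed)  # -> 2
```

The key design point is that each slot is repaired independently, so the repair never needs to take the whole index offline.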

If an application failure leaves stale locks behind, the Ancelus Lock Monitor utility can detect the offending thread and release its locks in a small fraction of a second – no downtime needed.
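The Lock Monitor’s implementation isn’t public either, but the mechanism it describes is simple to sketch: keep a table of which thread owns each lock, and release any lock whose owner is no longer alive. Everything below (the lock table, function names) is an illustrative assumption.

```python
import threading

# Hypothetical lock-monitor sketch: detect locks whose owning thread has
# died and release them without restarting anything. The lock table and
# names are illustrative, not the actual Ancelus Lock Monitor.

lock_table = {}  # resource name -> owning Thread

def acquire(resource, thread):
    lock_table[resource] = thread

def release_stale_locks():
    """Release locks held by threads that are no longer alive."""
    released = []
    for resource, owner in list(lock_table.items()):
        if not owner.is_alive():
            del lock_table[resource]
            released.append(resource)
    return released

worker = threading.Thread(target=lambda: None)
worker.start()
acquire("row:42", worker)
worker.join()                  # worker exits without releasing its lock
released = release_stale_locks()
print(released)  # -> ['row:42']
```

Because the check runs against live thread state rather than a timeout, it can free the stale lock as soon as the dead owner is detected.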

In the extreme case where the database must be restored, it can be accomplished in a few minutes from the last full backup plus a journal replay, rather than many hours of load-and-index. High-speed utilities plus integrated indexes make it possible.
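Backup-plus-journal recovery has a standard shape: load the last full backup, then replay only the journaled transactions committed after it. The journal format and sequence numbering below are illustrative assumptions, not the Ancelus file formats.

```python
# Hypothetical sketch of backup-plus-journal recovery: restore the last
# full backup, then replay the journal tail committed after it.
# The entry format and sequence numbers are illustrative assumptions.

def restore(backup, journal):
    """Rebuild database state from a full backup plus the journal tail."""
    db = dict(backup["data"])
    for entry in journal:
        if entry["seq"] <= backup["seq"]:
            continue  # already contained in the backup
        if entry["op"] == "put":
            db[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            db.pop(entry["key"], None)
    return db

backup = {"seq": 2, "data": {"a": 1, "b": 2}}
journal = [
    {"seq": 1, "op": "put", "key": "a", "value": 1},
    {"seq": 2, "op": "put", "key": "b", "value": 2},
    {"seq": 3, "op": "put", "key": "c", "value": 3},
    {"seq": 4, "op": "delete", "key": "a"},
]
restored = restore(backup, journal)
print(restored)  # -> {'b': 2, 'c': 3}
```

The speed claim in the text follows from this shape: the expensive part of a conventional restore is rebuilding indexes, and if the indexes are integrated with the data (as described above), replaying the journal tail is all that remains.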

This demonstrates the durability of the Ancelus system using the real-time journal. There are two other methods of retaining durable data. The snapshot backup for small data sets (generally under 2 GB) can deliver a quick full backup automatically every X seconds. It does a memory copy and writes to disk in the background. This puts a small amount of data at risk (the amount inserted in X seconds) and makes recovery slower (first, find the last good backup). For larger systems, and for those that cannot tolerate downtime, a real-time replicate with hot fail-over will duplicate every transaction to two databases, detect the failure of the primary, switch to the secondary, and then repair and re-synchronize the primary.
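The snapshot trade-off described above can be sketched: writers are only paused for the fast in-memory copy, and the slow disk write happens in the background, which is also why changes made after the copy are the data at risk. The function names and callback are illustrative assumptions.

```python
import copy
import threading

# Hypothetical sketch of the snapshot backup described above: copy the
# in-memory data set quickly, then persist the copy in the background so
# writers are only paused for the memory copy itself. Names are
# illustrative, not Ancelus parameters.

def snapshot(db, write_fn):
    frozen = copy.deepcopy(db)              # brief pause: memory copy only
    t = threading.Thread(target=write_fn, args=(frozen,))
    t.start()                               # slow disk write runs off-thread
    return t

written = []                                # stands in for the disk file
db = {"a": 1}
t = snapshot(db, written.append)
db["b"] = 2                                 # writes continue during backup
t.join()
print(written)  # -> [{'a': 1}]
```

Note that the insert made after the copy is absent from the backup: that is exactly the "amount inserted in X seconds" window of risk the text describes.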

Whatever causes the outage, DBAs are expected to get the system back quickly. The cost of downtime in lost productivity, customer aggravation, process control upsets, and service level violations is simply too high. Not to mention the DBA's reputation.

 

Craig Mullins

Craig Mullins, President & Principal Consultant at Mullins Consulting, Inc. IBM Gold Consultant and IBM Champion for Data and AI

5y

Eliminating the root causes of downtime is, indeed, an admirable goal. DBAs are always looking for ways to reduce downtime - and especially to minimize the amount of their precious free time required to keep their database systems up and running.
