Does time progress at the same rate everywhere?

Does time progress at the same rate everywhere?

We all have an intuitive concept of time based on our own experience as individuals. Unfortunately, that intuitive notion of time makes it easier to picture total order rather than partial order. It's easier to picture a sequence in which things happen one after another, rather than concurrently. It is easier to reason about a single order of messages than to reason about messages arriving in different orders and with different delays.

However, when implementing distributing systems we want to avoid making strong assumptions about time and order, because the stronger the assumptions, the more fragile a system is to issues with the "time sensor" - or the onboard clock. Furthermore, imposing an order carries a cost. The more temporal nondeterminism that we can tolerate, the more we can take advantage of distributed computation.

There are three common answers to the question "does time progress at the same rate everywhere?". These are:

  • "Global clock": yes
  • "Local clock": no, but
  • "No clock": no!

The synchronous system model has a global clock, the partially synchronous model has a local clock, and in the asynchronous system model one cannot use clocks at all. Let's look at these in more detail.

Time with a "global-clock" assumption

The global clock assumption is that there is a global clock of perfect accuracy, and that everyone has access to that clock. This is the way we tend to think about time, because in human interactions small differences in time don't really matter.

The global clock is basically a source of total order (exact order of every operation on all nodes even if those nodes have never communicated).

However, this is an idealized view of the world: in reality, clock synchronization is only possible to a limited degree of accuracy. This is limited by the lack of accuracy of clocks in commodity computers, by latency if a clock synchronization protocol such as?NTP?is used and fundamentally by?the nature of spacetime.

Assuming that clocks on distributed nodes are perfectly synchronized means assuming that clocks start at the same value and never drift apart. It's a nice assumption because you can use timestamps freely to determine a global total order - bound by clock drift rather than latency - but this is a?nontrivial?operational challenge and a potential source of anomalies. There are many different scenarios where a simple failure - such as a user accidentally changing the local time on a machine, or an out-of-date machine joining a cluster, or synchronized clocks drifting at slightly different rates and so on that can cause hard-to-trace anomalies.

Nevertheless, there are some real-world systems that make this assumption. Facebook's?Cassandra?is an example of a system that assumes clocks are synchronized. It uses timestamps to resolve conflicts between writes - the write with the newer timestamp wins. This means that if clocks drift, new data may be ignored or overwritten by old data; again, this is an operational challenge (and from what I've heard, one that people are acutely aware of). Another interesting example is Google's?Spanner: the paper describes their TrueTime API, which synchronizes time but also estimates worst-case clock drift.


Time with a "Local-clock" assumption

The second, and perhaps more plausible assumption is that each machine has its own clock, but there is no global clock. It means that you cannot use the local clock in order to determine whether a remote timestamp occurred before or after a local timestamp; in other words, you cannot meaningfully compare timestamps from two different machines.

The local clock assumption corresponds more closely to the real world. It assigns a partial order: events on each system are ordered but events cannot be ordered across systems by only using a clock.

However, you can use timestamps to order events on a single machine; and you can use timeouts on a single machine as long as you are careful not to allow the clock to jump around. Of course, on a machine controlled by an end-user this is probably assuming too much: for example, a user might accidentally change their date to a different value while looking up a date using the operating system's date control.

Time with a "No-clock" assumption

Finally, there is the notion of logical time. Here, we don't use clocks at all and instead track causality in some other way. Remember, a timestamp is simply a shorthand for the state of the world up to that point - so we can use counters and communication to determine whether something happened before, after or concurrently with something else.

This way, we can determine the order of events between different machines, but cannot say anything about intervals and cannot use timeouts (since we assume that there is no "time sensor"). This is a partial order: events can be ordered on a single system using a counter and no communication, but ordering events across systems requires a message exchange.

One of the most cited papers in distributed systems is Lamport's paper on?time, clocks and the ordering of events. Vector clocks, a generalization of that concept (which I will cover in more detail), are a way to track causality without using clocks. Cassandra's cousins Riak (Basho) and Voldemort (Linkedin) use vector clocks rather than assuming that nodes have access to a global clock of perfect accuracy. This allows those systems to avoid the clock accuracy issues mentioned earlier.

When clocks are not used, the maximum precision at which events can be ordered across distant machines is bound by communication latency.


I hope you found the article useful.

Happy Coding :)

要查看或添加评论,请登录

Umang Agarwal的更多文章

  • Push and Pull Configuration Management Tools

    Push and Pull Configuration Management Tools

    Push and pull configuration management tools are software solutions that facilitate the management and distribution of…

    10 条评论
  • What is Configuration Management in DevOps?

    What is Configuration Management in DevOps?

    Configuration management in DevOps refers to the process of managing and controlling the configuration of software…

    3 条评论
  • Principles of Web Distributed Systems Design

    Principles of Web Distributed Systems Design

    Like most things in life, taking the time to plan ahead when building a web service can help in the long run;…

  • What is non-monotonicity good for?

    What is non-monotonicity good for?

    The difference between monotonicity and non-monotonicity is interesting. For example, adding two numbers is monotonic…

  • Lamport Clocks

    Lamport Clocks

    Assuming that we cannot achieve accurate clock synchronization - or starting with the goal that our system should not…

  • Two Phase Commit (2PC)

    Two Phase Commit (2PC)

    Two phase commit (2PC) is a protocol used in many classic relational databases. For example, MySQL Cluster (not to be…

  • Partition and Replication

    Partition and Replication

    The manner in which a data set is distributed between multiple nodes is very important. In order for any computation to…

  • Forward Proxy vs Reverse Proxy Servers

    Forward Proxy vs Reverse Proxy Servers

    Forward Proxy A forward proxy is an intermediary that sits between one or more user devices and the internet. Instead…

  • Remote Procedure Calls (RPC)

    Remote Procedure Calls (RPC)

    In distributed computing, a remote procedure call (RPC) is when a computer program causes a procedure (subroutine) to…

  • Types of NoSQL databases

    Types of NoSQL databases

    NoSQL is a collection of data items represented in a key-value store, document store, wide column store, or a graph…

社区洞察

其他会员也浏览了