etcd: A Distributed Key-Value Store

etcd is an open-source distributed key-value store that provides shared configuration and service discovery for distributed systems. It was originally developed by CoreOS and has evolved into a versatile foundation for building distributed applications.

Key Features

Simple API

  • etcd exposes a simple API for CRUD operations on key-value pairs: the v2 API is RESTful JSON over HTTP, and the v3 API uses gRPC with an optional JSON-over-HTTP gateway
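As a rough illustration of what an HTTP request to etcd carries, the sketch below builds the JSON body for a put through the v3 JSON gateway. It assumes etcd 3.4 or later, where the gateway path is /v3/kv/put, and a server on localhost:2379. The helper names b64 and put_request are made up for this example. The gateway expects keys and values to be base64-encoded. Nothing is sent over the network.

```python
import base64
import json


def b64(text: str) -> str:
    """Base64-encode a UTF-8 string, as the v3 JSON gateway expects."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def put_request(key: str, value: str, endpoint: str = "http://localhost:2379"):
    """Return (url, body) for a put via the JSON gateway (hypothetical helper)."""
    # Assumed path for etcd 3.4+; older releases used /v3beta or /v3alpha.
    url = f"{endpoint}/v3/kv/put"
    body = json.dumps({"key": b64(key), "value": b64(value)})
    return url, body


url, body = put_request("foo", "bar")
print("POST", url)
print(body)
```

The same body could be sent as a POST request with any HTTP client.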

Consistency

  • etcd uses the Raft consensus algorithm to replicate data across all members
  • This ensures strong consistency: by default, reads are linearizable and reflect the most recent committed write

High availability

  • An etcd cluster can automatically elect a new leader if the existing leader fails
  • A typical clustered deployment has 3, 5, or 7 etcd members; odd sizes are preferred because the cluster needs a majority of members to make progress
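The reasoning behind the odd cluster sizes can be checked with a little arithmetic. The sketch below uses two helper functions named for this example, quorum and fault_tolerance. It shows that moving from 3 to 4 members raises the quorum without tolerating any additional failure.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority remains."""
    return members - quorum(members)


for n in (3, 4, 5, 7):
    print(f"{n} members: quorum={quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
```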

Watch API

  • Clients can watch a key or range of keys and receive updates
  • This lets clients react to changes as they happen instead of polling

Lease API

  • Keys can be assigned leases with TTLs
  • The key is deleted when the lease expires unless refreshed
  • Useful for leader elections, locks, and similar coordination patterns

Secure

  • Connections can be secured using TLS
  • Role-based access control can restrict access

Use Cases

  • Service discovery
  • Configuration management
  • Coordination & leader election
  • Message queues

etcd provides a simple key-value store for building distributed systems and applications. Its versatile data model makes etcd easy to integrate into modern toolchains.

Tutorial

etcdctl is a command line client for interacting with etcd. In this tutorial, we'll cover the basics of using etcdctl to manipulate keys and values in etcd.

Prerequisites

  • etcd running on localhost
  • etcdctl installed

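Put a Key-Value

Let's start by putting a simple key called foo with value bar, using the put command (etcdctl put foo bar).

Read a Key

We can read back the value by getting the key (etcdctl get foo).

With the v3 API this prints the key and then its value, "foo" followed by "bar", on separate lines. Adding the --print-value-only flag prints just "bar".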

Update a Key

To update the value, put the same key again with the new value (etcdctl put foo newbar).

Get foo again and we'll see it now prints "newbar".

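Delete a Key

To delete a key, use the del command (etcdctl del foo).

Getting foo now prints nothing, since the key no longer exists. With the v3 API the command still exits successfully in this case; only the older v2 API reports a "Key not found" error.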
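The four operations above can be collected into one script. The following is a minimal sketch that assumes etcdctl is on the PATH, uses the v3 API, and reaches etcd at localhost:2379. The helpers etcdctl_argv and run are names invented for this example. By default it only prints the commands (a dry run), so it can be read and tried without a running server.

```python
import shlex
import subprocess


def etcdctl_argv(*args, endpoints="localhost:2379"):
    """Build the argument list for one etcdctl call (v3 API assumed)."""
    return ["etcdctl", f"--endpoints={endpoints}", *args]


def run(argv, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print("$", shlex.join(argv))
    if dry_run:
        return None
    done = subprocess.run(argv, check=True, capture_output=True, text=True)
    return done.stdout


steps = [
    etcdctl_argv("put", "foo", "bar"),     # create
    etcdctl_argv("get", "foo"),            # read
    etcdctl_argv("put", "foo", "newbar"),  # update
    etcdctl_argv("del", "foo"),            # delete
]
for argv in steps:
    run(argv)  # dry run: nothing is sent to etcd
```

Calling run(argv, dry_run=False) executes a step against a live server and returns what etcdctl printed.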

Expiration with Leases

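etcd supports expiring keys by attaching them to leases. First create a lease with a 10-second TTL (etcdctl lease grant 10).

This will create a new lease and print its ID, like 12345. Next put a key with this lease attached, passing the ID through the --lease flag of the put command.

The key tempkey is now deleted automatically when the lease expires, 10 seconds after it was granted unless the lease is kept alive.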
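To make the lease semantics concrete, here is a toy in-memory model with a simulated clock. It is only an illustration of the behavior described above, not how etcd implements leases. The class LeaseStore and its methods are invented for this sketch.

```python
class LeaseStore:
    """Toy in-memory model of leases; not how etcd implements them."""

    def __init__(self):
        self.now = 0.0     # simulated clock, in seconds
        self.leases = {}   # lease id -> (ttl, expiry time)
        self.keys = {}     # key -> (value, lease id or None)
        self.next_id = 1

    def grant(self, ttl):
        lease_id = self.next_id
        self.next_id += 1
        self.leases[lease_id] = (ttl, self.now + ttl)
        return lease_id

    def put(self, key, value, lease=None):
        if lease is not None and lease not in self.leases:
            raise KeyError(f"unknown lease {lease}")
        self.keys[key] = (value, lease)

    def keep_alive(self, lease_id):
        ttl, _ = self.leases[lease_id]
        self.leases[lease_id] = (ttl, self.now + ttl)  # restart the countdown

    def advance(self, seconds):
        """Move the clock forward and drop expired leases with their keys."""
        self.now += seconds
        expired = {i for i, (_, exp) in self.leases.items() if exp <= self.now}
        for lease_id in expired:
            del self.leases[lease_id]
        self.keys = {k: v for k, v in self.keys.items() if v[1] not in expired}

    def get(self, key):
        entry = self.keys.get(key)
        return entry[0] if entry else None


store = LeaseStore()
lease = store.grant(ttl=10)
store.put("tempkey", "tempvalue", lease=lease)
store.advance(6)
store.keep_alive(lease)       # refreshed at t=6, so it now expires at t=16
store.advance(6)              # t=12: still present thanks to the refresh
print(store.get("tempkey"))
store.advance(10)             # t=22: lease expired, key removed
print(store.get("tempkey"))
```

The first print shows the value because the lease was refreshed in time; the second shows None because the lease ran out and took the key with it.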

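Watch a Key

We can watch a key for changes using the watch command (etcdctl watch foo).

This will monitor foo and print any updates to it in real time. The command blocks until it is interrupted, so run the put and del commands from a second terminal.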
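The idea behind watching can be sketched with a small in-memory store that calls back registered watchers on every change. This is a simplified model for illustration, not etcd's watch implementation. WatchableStore and its methods are names made up for this example.

```python
class WatchableStore:
    """Toy key-value store that notifies watchers; a model, not etcd's code."""

    def __init__(self):
        self.data = {}
        self.watchers = []  # list of (watched key, prefix flag, callback)

    def watch(self, key, callback, prefix=False):
        """Register callback(event, key, value) for one key or a key prefix."""
        self.watchers.append((key, prefix, callback))

    def _notify(self, event, key, value):
        for watched, prefix, callback in self.watchers:
            if key == watched or (prefix and key.startswith(watched)):
                callback(event, key, value)

    def put(self, key, value):
        self.data[key] = value
        self._notify("PUT", key, value)

    def delete(self, key):
        if key in self.data:
            del self.data[key]
            self._notify("DELETE", key, None)


events = []
store = WatchableStore()
store.watch("foo", lambda event, key, value: events.append((event, key, value)))
store.put("foo", "bar")
store.put("other", "ignored")  # not watched, so no event is recorded
store.put("foo", "newbar")
store.delete("foo")
for event in events:
    print(event)
```

Passing prefix=True in this model plays the role of watching a range of keys rather than a single one.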

etcdctl lets us easily manipulate keys and interact with an etcd cluster using simple commands.

The Raft Consensus Algorithm

Raft is a consensus algorithm for managing a replicated log in distributed systems. It provides a way for nodes in a cluster to maintain the same shared state with strong consistency guarantees.

Some key properties of Raft:

Leader election - Nodes elect a leader using randomized election timeouts. At most one leader exists per term, and all writes go through it.

Log replication - The leader accepts commands from clients, appends them to its own log as entries, and replicates those entries to the follower nodes.

Safety - An entry is committed only once a majority (quorum) of nodes has stored it. A write is considered successful only after it has been committed.

Membership changes - Nodes can be added or removed while the cluster keeps serving requests. Configuration changes are themselves replicated through the log, so consistency is preserved during the transition.

Leader failure - If the leader fails, followers stop receiving its heartbeats and a new leader is elected using the randomized timeout mechanism. The cluster keeps operating as long as a majority of nodes remains available.
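The quorum rule in the Safety property can be sketched as a small calculation: given how far each node's log has progressed, the commit point is the highest index stored on a majority. This is a simplified illustration with a function name chosen for the example. It leaves out other rules of the full algorithm, such as the leader only counting replicas for entries from its current term.

```python
def commit_index(match_index):
    """Highest log index stored on a majority of nodes.

    match_index holds, for every node including the leader, the index of
    the last log entry known to be stored on that node.
    """
    quorum = len(match_index) // 2 + 1
    ordered = sorted(match_index, reverse=True)
    # The quorum-th largest value is present on at least `quorum` nodes.
    return ordered[quorum - 1]


# Five nodes: the leader is at index 9, followers lag by varying amounts.
print(commit_index([9, 7, 5, 5, 3]))
```

With five nodes the quorum is three, and index 5 is the highest one that at least three nodes have stored, so entries up to 5 are committed.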

Raft is designed to be understandable and performant. It prioritizes consistency: the cluster remains available only while a majority of nodes can communicate with each other. The leader orders all writes, and followers replicate its log and can take over if it fails.

The protocol is used in many distributed systems, including etcd (and through it Kubernetes, which stores its cluster state in etcd), Docker Swarm, and CockroachDB. It enables building scalable applications that require coordination and consensus between distributed components. Overall, Raft balances safety, consistency, and high availability in an efficient consensus protocol.

Installing etcd

This guide covers how to install etcd from binaries on Linux.

Download the etcd release

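Get the latest release archive for your platform from the etcd releases page on GitHub.

Extract the binaries

Extract the compressed file with tar.

This will unpack a directory named after the release (etcd-<version>-linux-amd64 for 64-bit Linux) that contains the etcd and etcdctl binaries.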
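The download and extract steps can be scripted as well. The sketch below assumes the archive naming used on the etcd GitHub releases page and uses v3.5.0 purely as an example version. The function names are invented for this example. The actual download is left commented out, so running the block only prints the URL.

```python
import tarfile
import urllib.request

VERSION = "v3.5.0"  # example only; pick the current release from the releases page


def release_url(version, platform="linux-amd64"):
    """URL of the release tarball, following the naming used on GitHub."""
    name = f"etcd-{version}-{platform}.tar.gz"
    return f"https://github.com/etcd-io/etcd/releases/download/{version}/{name}"


def download_and_extract(version, dest="."):
    """Fetch the tarball and unpack it under dest (needs network access)."""
    archive, _ = urllib.request.urlretrieve(release_url(version))
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)  # creates etcd-<version>-linux-amd64/ under dest


print(release_url(VERSION))
# download_and_extract(VERSION)  # uncomment to actually download and unpack
```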

Configure etcd

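Create an etcd configuration file. etcd reads it as YAML, and a local single-member setup typically sets a member name, a data directory, and the client URLs to listen on and advertise.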
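The original settings are not reproduced here, so the following is a hedged sketch of what a minimal single-member configuration might contain. The keys are standard etcd configuration options, while the values (member name, data directory, URLs) and the file name etcd.conf.yml are assumptions to adjust for your environment.

```python
# Assumed single-member settings; adjust the name, path, and URLs to your setup.
settings = {
    "name": "default",
    "data-dir": "/var/lib/etcd",
    "listen-client-urls": "http://localhost:2379",
    "advertise-client-urls": "http://localhost:2379",
}


def render_config(values):
    """Render flat string settings as simple 'key: value' YAML lines."""
    return "".join(f"{key}: {value}\n" for key, value in values.items())


config_text = render_config(settings)
print(config_text, end="")
# Save the text as etcd.conf.yml, then start the server with:
#   etcd --config-file etcd.conf.yml
```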

Run etcd

Start the etcd server process, passing the configuration file through the --config-file flag.

etcd is now installed and running! The etcdctl command can be used to interact with the server.

Recap

The main steps to install etcd are:

  • Download release
  • Extract binaries
  • Create config file
  • Start etcd server process

Now you have a local etcd instance for testing and development.

etcd v2 vs v3

etcd v3 was a major rewrite that improved stability and performance and added new features over the v2 releases. Here are some of the key differences:

API Changes

  • v3 uses gRPC (over HTTP/2) for its client/server API instead of v2's JSON over HTTP
  • The protocol is defined in .proto files and supports streaming
  • Older v3 releases can still serve the v2 API for compatibility

Data Model

  • v3 has a flat key space with byte-array keys and values, replacing v2's hierarchy of directories and keys
  • Supports transactions for atomic operations
  • More efficient encoding for storage
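The transaction idea can be illustrated with a toy compare-then-apply function over a plain dictionary. This models only the shape of a v3 transaction (conditions, then one of two branches applied as a unit). It is not etcd's implementation, and the function txn and the key names are invented for the example.

```python
def txn(store, compare, success, failure):
    """Toy compare-then-apply transaction on a dict; a model, not etcd's code.

    compare: list of (key, expected_value) pairs that must all match.
    success / failure: lists of (key, value) puts applied as one unit.
    """
    ok = all(store.get(key) == expected for key, expected in compare)
    for key, value in (success if ok else failure):
        store[key] = value
    return ok


store = {"leader": "node-a"}
# Take over leadership only if node-a still holds it (compare-and-swap).
first = txn(store, [("leader", "node-a")], [("leader", "node-b")], [])
# A second attempt with the stale expectation fails and changes nothing.
second = txn(store, [("leader", "node-a")], [("leader", "node-c")], [])
print(first, second, store)
```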

Consensus Algorithm

  • Both v2 and v3 use the Raft consensus algorithm; etcd has never been based on Paxos
  • v3 reads are linearizable by default, with faster serializable reads available as an opt-in

Performance

  • v3 has much better baseline performance vs v2
  • 10x higher read QPS and lower latency
  • Supports thousands of nodes vs hundreds in v2

Operability

  • v3 has improved snapshotting for backups
  • Client connection balancing
  • Improved leader election and heartbeat monitoring
  • Prometheus metrics exposed

Security

  • TLS transport encryption
  • Authentication using client certs
  • HTTP request authentication via tokens

Overall, etcd v3 provided a much more production-ready foundation with major improvements to API, performance, scalability, and security. It was a significant update for users looking for a robust key-value store.

Using ETCDCTL_API with etcdctl

etcdctl provides a command line interface for interacting with etcd. Since etcd 3.4, etcdctl talks to the etcd server over gRPC using the v3 API by default; earlier 3.x releases defaulted to the v2 API. etcdctl provides backwards compatibility for the older JSON-over-HTTP v2 API through the ETCDCTL_API environment variable.

API Versions

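ETCDCTL_API=2 - Original v2 client that uses JSON over HTTP

ETCDCTL_API=3 - gRPC v3 client, the default since etcd 3.4

Names such as v3alpha and v3beta are URL prefixes of the JSON gateway in early v3 releases, not values of ETCDCTL_API.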

Setting ETCDCTL_API

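To select a specific API version, set ETCDCTL_API before running etcdctl commands, for example with export ETCDCTL_API=3 in the shell.

If ETCDCTL_API is unset, etcdctl from etcd 3.4 or later uses the gRPC v3 API; older releases default to v2.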
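When etcdctl is launched from a script, the variable can be set for the child process only. The sketch below assumes the accepted values are "2" and "3", as in etcdctl 3.x. The helper etcdctl_env is a name invented for this example. It prepares the environment without running etcdctl.

```python
import os


def etcdctl_env(api_version="3", base=None):
    """Copy of the environment with ETCDCTL_API set for a child etcdctl."""
    if api_version not in ("2", "3"):
        raise ValueError("ETCDCTL_API must be '2' or '3'")
    env = dict(os.environ if base is None else base)
    env["ETCDCTL_API"] = api_version
    return env


env = etcdctl_env("3")
print("ETCDCTL_API =", env["ETCDCTL_API"])
# subprocess.run(["etcdctl", "get", "foo"], env=env) would then use the v3 API.
```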

Compatibility

The ETCDCTL_API variable allows easy switching between API versions for compatibility across etcd versions. This helps enable a graceful upgrade for applications still using older APIs while running a modern etcd backend.

Recap

The ETCDCTL_API environment variable sets the client API version for etcdctl. The accepted values are 2 and 3. If it is unset, etcdctl from etcd 3.4 or later defaults to the gRPC v3 API. Setting it explicitly lets you switch etcdctl between API versions, which helps applications that still rely on the older API upgrade gracefully.

Conclusion

etcd provides a reliable key-value store for building distributed systems and applications. We covered how to use etcdctl to manipulate data in etcd, including CRUD operations on keys, leases for expiration, and watching keys for changes.

While etcd originated at CoreOS as a coordination store for clusters of Container Linux machines and is best known today as the backing store of Kubernetes, it has evolved into a general-purpose data store used in many contexts. Its simple data model and focus on consistency make etcd a robust foundation for coordination and discovery tasks in distributed systems.

From small clusters to large geographic deployments, etcd scales predictably and offers strong safety guarantees. Features like automatic leadership election, data replication, and linearizable reads provide the reliability needed for mission-critical services.

Whether you're building microservices, distributed databases, or large-scale web applications, etcd is a proven open source tool for managing shared state. Going forward, enhancements like improved snapshotting, proxy v2/v3 support, and authentication scoping will further cement etcd's place as a core component of the modern service stack.

More articles by Christopher Adamson