Guide to Kubernetes StatefulSet – When to Use It and Examples

Gopal Das

Corporate Trainer | Gitlab | Kubernetes Consultant & Trainer | Ansible | Docker | Terraform with AWS & Azure | Devops | Jenkins | Azure Devops | AWS Devops | HashiCorp Vault

Published: November 27, 2024

Kubernetes automates container management tasks so you can efficiently deploy and scale your workloads. It can distribute your containers across clusters of hundreds or thousands of Nodes.

Most developers begin using Kubernetes for stateless apps. A stateless system doesn’t modify its environment or write any persistent data. These components are easy to deploy to Kubernetes because their container instances are interchangeable.

Kubernetes can also be used for stateful systems, though. This could be a database, a backend that writes files to persistent volumes, or a service where one replica is elected the leader to gain control of its neighbors.

In this article, you’ll learn how to use StatefulSet objects to reliably manage state in your cluster.

What are Kubernetes StatefulSets?

StatefulSets are used to manage stateful applications that require persistent storage, stable unique network identifiers, and ordered deployment and scaling. They are very useful for databases and data stores that require persistent storage or for distributed systems and consensus-based applications such as etcd and ZooKeeper.

A StatefulSet’s YAML manifest defines a template for its Pods. Kubernetes automatically creates, replaces, and deletes Pods as you scale the StatefulSet, while preserving any previously assigned identities.

StatefulSets provide several advantages over the ReplicaSet and Deployment controllers used for stateless Pods:

Reliable replica identifiers. Each Pod in a StatefulSet is allocated a persistent identifier. The Pod will retain its identifier even if it’s replaced or rescheduled, ensuring the new Pod runs with the same characteristics.
Stable storage access. Pods in a StatefulSet are individually assigned their own Persistent Volume claims. The Pod’s volume will be reattached after it’s rescheduled, providing stable storage access after a rollout or scaling operation.
Rolling updates in a guaranteed order. StatefulSets support automated rolling updates that proceed in a predictable sequence. Pods are updated one at a time in reverse ordinal order, from the highest-numbered Pod down to the lowest, and each updated Pod must be Running and Ready before the next one is replaced.
Consistent network identities. Pods in StatefulSets have reliable network identities. Their hostnames include their numerical replica identifier, allowing external applications to interact with the same replica after a Pod’s rescheduled.

StatefulSet vs. DaemonSet vs. Deployment

All three controllers create Pods from a template you define, but they serve different purposes:

StatefulSets are used for stateful applications, and they maintain a sticky identity for each of their Pods.
DaemonSets keep a copy of a Pod on every Node in the cluster (or on a selected subset of Nodes), making them a great choice for node-level services.
Deployments manage stateless applications, providing declarative updates to applications with capabilities for scaling, rolling updates, and rollbacks.

When to use StatefulSets?

StatefulSets should be used when you’re deploying an application that requires stable identities for its Pods. Reach for a StatefulSet instead of a ReplicaSet or Deployment if your system will be disrupted when a specific Pod replica is replaced.

Replicated databases are a good example of the scenarios that StatefulSets accommodate. One Pod acts as the primary database node, handling both read and write operations, while additional Pods are deployed as read-only replicas.

Although each Pod may run the same container image, each one needs special configuration to set whether it’s in primary or read-only mode. This means your Pods possess their own state:

postgres-0 – Primary node (read-write).
postgres-1 – Read-only replica.
postgres-2 – Read-only replica.

Regular ReplicaSets and Deployments aren’t suitable for this situation. Scaling down a Deployment removes arbitrary Pods, which could include the primary node in your database system. When you use a StatefulSet, Kubernetes terminates Pods in the opposite order to their creation. This ensures it’ll be postgres-2 that’s destroyed first.
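
The scale-down order described above can be sketched in a few lines of shell. This only illustrates the ordering rule; it is not how the StatefulSet controller is implemented, and the function name scale_down_order is hypothetical.

```shell
#!/bin/sh
# Print the Pods a StatefulSet would remove when scaling down,
# in removal order (highest ordinal first).
# Usage: scale_down_order <statefulset-name> <current-replicas> <target-replicas>
scale_down_order() {
  name=$1
  i=$(( $2 - 1 ))              # ordinals start at 0, so the last Pod is replicas-1
  while [ "$i" -ge "$3" ]; do
    echo "${name}-${i}"
    i=$(( i - 1 ))
  done
}

# Scaling the example StatefulSet from 3 replicas to 1:
scale_down_order postgres 3 1
# prints postgres-2, then postgres-1; postgres-0 (the primary) is kept
```

Because the primary always has ordinal 0, it is the last Pod that a scale-down would ever remove.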

Several other StatefulSet features also apply to this example:

The applications that use your database need to reliably connect to the primary node, so they can both read and write data. The StatefulSet’s stable network identifiers ensure postgres-0.service.namespace.svc.cluster.local will always map to the primary database Pod, even after scaling or replacing your Pods.
The read-only replicas shouldn’t start until after the primary is up. By default, StatefulSets create Pods in ordinal order, so each successive Pod is only created when the previous one is Running and Ready. This ensures there’s data available to replicate.
Each replica has its own sticky volume for storage. The persistent data stored by each replica is bound to its Pod. The version of the database that postgres-1 has replicated needs to be maintained separately from the copy held by postgres-2. StatefulSets can handle this requirement.
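
To make the stable DNS naming pattern concrete, here is a minimal shell sketch that assembles a Pod's fully qualified name from its parts. The function name pod_fqdn is hypothetical, and the example assumes the headless service is called postgres, the namespace is default, and the cluster uses the default cluster.local domain.

```shell
#!/bin/sh
# Build the DNS name of a StatefulSet Pod:
#   <statefulset>-<ordinal>.<headless-service>.<namespace>.svc.<cluster-domain>
# Usage: pod_fqdn <statefulset> <ordinal> <service> <namespace> [cluster-domain]
pod_fqdn() {
  # The cluster domain defaults to cluster.local (an assumption; clusters can override it).
  printf '%s-%s.%s.%s.svc.%s\n' "$1" "$2" "$3" "$4" "${5:-cluster.local}"
}

pod_fqdn postgres 0 postgres default
# prints postgres-0.postgres.default.svc.cluster.local
```

A client that must write to the database can hard-code the name for ordinal 0, while read-only traffic can target the higher ordinals.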

StatefulSet example: Running PostgreSQL in Kubernetes

Ready to put this example into practice? Here’s how to run three replicas of PostgreSQL in Kubernetes using a StatefulSet.

First, set up dynamic storage provisioning so the cluster can manage local disk resources. Start by checking whether any storage classes already exist:

$ kubectl get sc

At this point, the cluster has no storage class.

Storage classes in Kubernetes are used to provision storage dynamically, typically in the cloud. The following command installs the Rancher local-path-provisioner, which creates a storage class backed by local disk:

$ kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.24/deploy/local-path-storage.yaml

List the storage classes again to confirm that the new one was created:

$ kubectl get sc
NAME         PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  3h1m

Creating a StatefulSet

First, create a headless service for your StatefulSet. A headless service is a service that defines a port binding but has its clusterIP set to None. StatefulSets require a headless service to control the network identities of their Pods.

Copy the following YAML and save it as postgres-service.yaml in your working directory:

apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  ports:
    - name: postgres
      port: 5432
  clusterIP: None
  selector:
    app: postgres

Use kubectl to add the service to your cluster:

$ kubectl apply -f postgres-service.yaml
service/postgres created

Next, copy the following YAML to postgres-statefulset.yaml. It defines a StatefulSet that runs three replicas of the postgres:latest image.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  selector:
    matchLabels:
      app: postgres
  serviceName: postgres
  replicas: 3
  template:
    metadata:
      labels:
        app: postgres
    spec:
      initContainers:
        - name: postgres-init
          image: postgres:latest
          command:
          - bash
          - "-c"
          - |
            set -ex
            [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
            ordinal=${BASH_REMATCH[1]}
            if [[ $ordinal -eq 0 ]]; then
              printf "I am the primary"
            else
              printf "I am a read-only replica"
            fi
      containers:
        - name: postgres
          image: postgres:latest
          env:
            - name: POSTGRES_USER
              value: postgres
            - name: POSTGRES_PASSWORD
              value: postgres
            - name: POD_IP
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.podIP
          ports:
          - name: postgres
            containerPort: 5432
          livenessProbe:
            exec:
              command:
                - "sh"
                - "-c"
                - "pg_isready --host $POD_IP"
            initialDelaySeconds: 30
            periodSeconds: 5
            timeoutSeconds: 5
          readinessProbe:
            exec:
              command:
                - "sh"
                - "-c"
                - "pg_isready --host $POD_IP"
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 1
          volumeMounts:
          - name: data
            mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "local-path"
      resources:
        requests:
          storage: 1Gi

Apply the manifest to your cluster to create your StatefulSet:

$ kubectl apply -f postgres-statefulset.yaml
statefulset.apps/postgres created

Now you can list the Pods running in your cluster. The names of the three Pods from your StatefulSet will be suffixed with the sequential index they’ve been assigned:

$ kubectl get pods
NAME         READY   STATUS    RESTARTS   AGE
postgres-0   1/1     Running   0          74s
postgres-1   1/1     Running   0          63s
postgres-2   1/1     Running   0          51s

The StatefulSet creates each Pod in order, once the previous one is Running and Ready. This ensures the replicas don’t start until the previous Pod is ready to synchronize data. If a ReplicaSet had been used, all three Pods would have been created at the same time.

The StatefulSet uses init containers to determine whether new Pods are the Postgres primary or a replica. Each init container inspects its numeric index assigned by the StatefulSet controller; if it’s 0, the Pod is the first in the StatefulSet, so it becomes the primary database node.

Otherwise, it’s a replica:

$ kubectl logs postgres-0 -c postgres-init
I am the primary

$ kubectl logs postgres-1 -c postgres-init
I am a read-only replica

This demonstrates how StatefulSets let you consistently designate Pods as having a specific role. In a real-life Postgres example, you’d use your init containers to set up database replication from the primary Pod to the replicas. When the Pod’s index is 0, it should be configured as the primary; when it’s a higher number, the Pod is a replica that needs to synchronize the existing data and run in read-only mode.
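
As a rough sketch of the role-selection logic a real init container might build on, the following POSIX shell functions derive the ordinal from a Pod hostname and map it to a role. The function names pod_ordinal and pod_role are hypothetical, and the replication setup itself is deliberately left out because it depends on your Postgres configuration.

```shell
#!/bin/sh
# Extract the ordinal from a StatefulSet Pod hostname such as "postgres-2".
# Fails (returns 1) if the hostname does not end in "-<number>".
pod_ordinal() {
  ordinal=${1##*-}              # keep only the text after the last hyphen
  case $ordinal in
    ''|*[!0-9]*) return 1 ;;    # empty or non-numeric: not a StatefulSet-style name
  esac
  echo "$ordinal"
}

# Map a hostname to the role described in the article:
# ordinal 0 is the primary, every higher ordinal is a read-only replica.
pod_role() {
  ordinal=$(pod_ordinal "$1") || return 1
  if [ "$ordinal" -eq 0 ]; then
    echo primary
  else
    echo replica
  fi
}

pod_role postgres-0   # prints primary
pod_role postgres-1   # prints replica
```

Inside a Pod you would pass the output of the hostname command instead of a literal name, then branch on the result to configure the primary or start replication.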

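Each Pod in the StatefulSet gets its own Persistent Volume and Persistent Volume Claim. These are created using the manifest template defined in the StatefulSet’s volumeClaimTemplates field. The sample output below was captured in a namespace named postgres-sts with a storage class named standard; with the manifests above, expect your own namespace and the local-path storage class instead.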

$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                          STORAGECLASS   REASON   AGE
pvc-6b48180c-0728-4666-aea9-12e0960f732e   1Gi        RWO            Delete           Bound    postgres-sts/data-postgres-0   standard                10m
pvc-83fc4a44-4927-454e-83e8-2c2f4c80af07   1Gi        RWO            Delete           Bound    postgres-sts/data-postgres-1   standard                10m
pvc-d7496cf0-97d2-405b-bf95-b28bf9bcedec   1Gi        RWO            Delete           Bound    postgres-sts/data-postgres-2   standard                10m

$ kubectl get pvc
NAME              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-postgres-0   Bound    pvc-6b48180c-0728-4666-aea9-12e0960f732e   1Gi        RWO            standard       10m
data-postgres-1   Bound    pvc-83fc4a44-4927-454e-83e8-2c2f4c80af07   1Gi        RWO            standard       10m
data-postgres-2   Bound    pvc-d7496cf0-97d2-405b-bf95-b28bf9bcedec   1Gi        RWO            standard       10m

This allows the Pods to manage their own state, independently of the others in the StatefulSet.
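
The claim names in the output above follow the pattern <template-name>-<statefulset-name>-<ordinal>. A short shell sketch of that naming rule, using a hypothetical helper called pvc_names, looks like this:

```shell
#!/bin/sh
# List the PersistentVolumeClaim names a StatefulSet generates
# from one volumeClaimTemplate, one per replica.
# Usage: pvc_names <template-name> <statefulset-name> <replicas>
pvc_names() {
  i=0
  while [ "$i" -lt "$3" ]; do
    echo "$1-$2-$i"             # e.g. data-postgres-0
    i=$(( i + 1 ))
  done
}

pvc_names data postgres 3
# prints data-postgres-0, data-postgres-1, data-postgres-2
```

Because the name is derived from the Pod's ordinal, a rescheduled postgres-1 looks up data-postgres-1 again and is reattached to the same data.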
