Kubernetes - Volumes
Kubernetes has revolutionized the orchestration of containerized applications by providing robust mechanisms for automating deployment, scaling, and operations. A critical component of stateful application management within Kubernetes is persistent storage. This article explores the concepts of Persistent Volumes (PVs) and Persistent Volume Claims (PVCs), the essential abstractions that Kubernetes offers to handle storage resources. We discuss their definitions, lifecycle, and interaction with the broader Kubernetes ecosystem, shedding light on how they enable state persistence in a dynamic, distributed environment.
Introduction
As containerized applications continue to gain popularity, managing persistent storage in these ephemeral environments has emerged as a significant challenge. Kubernetes addresses this challenge with two key abstractions: Persistent Volumes (PVs) and Persistent Volume Claims (PVCs). These abstractions separate storage concerns (such as provisioning, lifecycle management, and capacity) from the application logic, allowing developers to focus on building scalable and resilient applications.
Understanding Persistent Volumes (PVs)
A PV in Kubernetes is a representation of a piece of storage that has been provisioned by an administrator or dynamically through storage plugins. It is a cluster-level resource, meaning it is not tied to a particular namespace and can be claimed by PVCs across different namespaces. PVs support various storage backends, including block storage, file systems, and object storage, provided by on-premises solutions or cloud storage services.
PVs embody the following characteristics:
Capacity: The storage size of the PV, which is defined at the time of creation.
Access Modes: The ways in which the PV can be mounted on a pod, such as ReadWriteOnce (RWO), ReadOnlyMany (ROX), and ReadWriteMany (RWX).
Reclaim Policy: The policy that dictates what happens to the PV's data after the associated PVC is deleted. Common policies are Retain and Delete; a third, Recycle, is deprecated.
Storage Class: An optional attribute linking the PV to a particular storage class, which defines provisioning policies and other storage parameters.
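Taken together, these attributes form the core of a PV manifest. Below is a minimal sketch using an NFS backend; the server address and export path are placeholders for illustration:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi                          # Capacity
  accessModes:
    - ReadWriteMany                        # Access mode
  persistentVolumeReclaimPolicy: Retain    # Reclaim policy
  storageClassName: nfs                    # Optional storage class
  nfs:
    server: <nfs-server-ip>                # placeholder server address
    path: /exports/data                    # placeholder export path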
Lifecycle of Persistent Volumes
The lifecycle of a PV begins with its creation and provisioning. Once created, a PV can be in one of the following states:
Available: The PV is not yet bound to any PVC and is available for claiming.
Bound: The PV has been claimed by a PVC and is no longer available for new claims.
Released: The PVC associated with the PV has been deleted, but the underlying storage resource is not yet reclaimed.
Failed: The PV has encountered an error during automatic reclamation.
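These states appear in the STATUS column of kubectl get pv; you can also read a single volume's phase directly from its status field:
kubectl get pv
kubectl get pv <pv-name> -o jsonpath='{.status.phase}'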
Understanding Persistent Volume Claims (PVCs)
A PVC is a request for storage by a user (typically a developer or an application). It specifies the size and access modes, among other storage attributes. PVCs are namespaced resources, meaning they belong to a specific namespace and can only be accessed by pods within the same namespace.
When a PVC is created, the Kubernetes control plane looks for an available PV that satisfies the claim's requirements. If a suitable PV is found, the PVC binds to it, creating a one-to-one mapping between the PV and PVC. If no suitable PV exists and dynamic provisioning is configured, a new PV is dynamically provisioned to satisfy the claim.
Interaction Between PVs and PVCs
The relationship between PVs and PVCs is fundamental to managing stateful workloads in Kubernetes. This interaction allows for:
Decoupling: Applications are decoupled from the underlying storage infrastructure, simplifying development and deployment.
Portability: The use of abstracted storage resources enables workload portability across different environments and cloud providers.
Scalability: The dynamic provisioning of storage resources allows applications to scale seamlessly without manual intervention from administrators.
Creating a Persistent Volume (PV) on a GlusterFS brick involves a few steps, including setting up your GlusterFS cluster and bricks. A brick in GlusterFS is a basic unit of storage, corresponding to a directory on a server in the storage network. Once your GlusterFS cluster is ready, you can create a PV that references the GlusterFS volume (composed of one or more bricks).
Here are the general steps to create a PV on a GlusterFS brick:
1. Set Up a GlusterFS Cluster:
Ensure that you have a running GlusterFS cluster with at least one volume created and started.
2. Retrieve GlusterFS Volume Information:
You'll need the following information about your GlusterFS volume (a command for retrieving it is shown after the list):
- The volume name
- Endpoints (the list of IP addresses or hostnames of the GlusterFS servers)
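On any server in the trusted pool, the gluster CLI reports the volume name and the bricks (and thus the server addresses) backing it:
gluster volume info <volume-name>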
3. Create GlusterFS Endpoints and Service in Kubernetes:
Create an Endpoints resource that lists the IP addresses of your GlusterFS servers, and then create a Service that points to the Endpoints. Endpoints YAML (glusterfs-endpoints.yaml):
apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster
subsets:
  - addresses:
      - ip: <glusterfs-server-1-ip>
      - ip: <glusterfs-server-2-ip>
      - ip: <glusterfs-server-n-ip>
    ports:
      - port: 1
Service YAML (glusterfs-service.yaml):
apiVersion: v1
kind: Service
metadata:
  name: glusterfs-cluster
spec:
  ports:
    - port: 1
Apply the configurations using kubectl:
kubectl apply -f glusterfs-endpoints.yaml
kubectl apply -f glusterfs-service.yaml
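It is worth confirming that the Endpoints object was created and lists your server addresses before proceeding:
kubectl get endpoints glusterfs-cluster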
4. Create the Persistent Volume:
With the endpoints in place, you can now create a PV that references the GlusterFS volume. PV YAML (glusterfs-pv.yaml):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: glusterfs-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  glusterfs:
    endpoints: glusterfs-cluster
    path: <glusterfs-volume-name>
    readOnly: false
  persistentVolumeReclaimPolicy: Retain
Replace <glusterfs-server-1-ip>, <glusterfs-server-2-ip>, <glusterfs-server-n-ip>, and <glusterfs-volume-name> with your actual GlusterFS server IPs and volume name. Then create the PV:
kubectl apply -f glusterfs-pv.yaml
5. Verify the Persistent Volume:
After creating the PV, check that it's available in your Kubernetes cluster:
kubectl get pv
Please note that this is a general guide; it assumes that the GlusterFS volume is already set up and that your Kubernetes cluster can communicate with the GlusterFS servers. Also note that the in-tree GlusterFS volume plugin shown above was deprecated and then removed in Kubernetes v1.26, so on newer clusters you will need an out-of-tree alternative. Always refer to the official GlusterFS and Kubernetes documentation for detailed instructions tailored to your specific environment and version.
To create a Persistent Volume (PV) on a file system in Kubernetes, you need to define a PV resource that specifies the details of the storage on which it is hosted. Here's how to create a PV using a directory from a node's local filesystem, which could be a mounted disk or any directory accessible to the node.
Important: Using local storage ties your PV to a specific node, which limits the portability of the pods that use it. Furthermore, data on local storage is not replicated, so it is not suitable for production workloads that require high availability.
Here is an example of how you can create a PV using a directory from the local filesystem:
1. Prepare the Storage on the Node:
Choose a directory on your node that you want to expose as a PV. For instance, you might have a directory at /mnt/data that you wish to use. Make sure that this directory exists and has the proper permissions set:
sudo mkdir -p /mnt/data
sudo chown -R nobody:nogroup /mnt/data
sudo chmod 0777 /mnt/data
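You can then verify the ownership and permissions before referencing the directory from Kubernetes:
ls -ld /mnt/data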
2. Define the Persistent Volume:
Create a YAML file for your PV, such as local-pv.yaml, and define the PV resource:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/data
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <node-name>
In the above YAML file, replace <node-name> with the name of the node where the storage is located. This will ensure that the PV is only available to pods running on that specific node.
Using kubectl get nodes with JSON or YAML output:
You can also output the information in JSON or YAML format and then use tools like jq for JSON processing to extract node names:
kubectl get nodes -o json | jq '.items[].metadata.name'
This jq command will print the node names, one per line, in quotes:
"node1"
"node2"
"node3"
For YAML output, you would use:
kubectl get nodes -o yaml
You can then manually look through the YAML output for the node names, or use a tool like yq to parse YAML from the command line.
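As a sketch, assuming the Go-based yq (v4 syntax), the extraction mirrors the jq version; a dependency-free kubectl jsonpath alternative is also shown:
kubectl get nodes -o yaml | yq '.items[].metadata.name'
kubectl get nodes -o jsonpath='{.items[*].metadata.name}'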
3. Create the Persistent Volume:
Apply the configuration to your cluster:
kubectl apply -f local-pv.yaml
4. Verify the Persistent Volume:
After creating the PV, you can check its status with the following command:
kubectl get pv local-pv
5. Use the Persistent Volume:
To use this PV, a pod needs to create a Persistent Volume Claim (PVC) that requests storage of the appropriate size and access modes. Here is an example PVC that could be used to claim the local PV:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 5Gi
The storageClassName in the PVC should match the storageClassName defined in your local PV. This is what the Kubernetes control plane uses to bind the PVC to the appropriate PV.
Once you have created the PVC, you can reference it in the volumes section of a pod's spec to mount the local storage:
volumes:
  - name: local-storage
    persistentVolumeClaim:
      claimName: local-pvc
Remember that when using local volumes, if the node fails or the pod is rescheduled to another node, the data will not be accessible from the new node. Local volumes are typically used for temporary storage or in situations where the application can handle node-specific storage and potential data loss.
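For completeness, here is a minimal Pod manifest that ties these pieces together; the pod name and nginx image are illustrative choices, not requirements:
apiVersion: v1
kind: Pod
metadata:
  name: local-storage-pod        # illustrative name
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - mountPath: /usr/share/nginx/html   # where the volume appears in the container
          name: local-storage
  volumes:
    - name: local-storage
      persistentVolumeClaim:
        claimName: local-pvc     # the PVC defined above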
To create pods that use a PVC through a Deployment, you need to define a Deployment resource in Kubernetes. The Deployment will specify a template for pod creation, which includes volume mounts that refer to your PVC.
Here's an example of how you can set this up:
1. Create the PersistentVolumeClaim:
Before you can use a PVC in your Deployment, you need to have an existing PVC in your Kubernetes cluster. Here's an example YAML definition for a PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Apply this definition with kubectl:
kubectl apply -f my-pvc.yaml
Make sure the PVC is bound to a PersistentVolume (PV) and is ready for use.
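A quick check is to query the claim and confirm its STATUS reads Bound:
kubectl get pvc my-pvc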
2. Create a Deployment that uses the PVC:
Define a Deployment YAML that includes a volume mount for the PVC. Here's an example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-container
          image: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: /usr/share/nginx/html
              name: my-volume
      volumes:
        - name: my-volume
          persistentVolumeClaim:
            claimName: my-pvc
In this example, the Deployment will create pods with a container running Nginx, and the PVC will be mounted at /usr/share/nginx/html. Note that my-pvc requests ReadWriteOnce access, which permits mounting by only a single node; with replicas: 3, all pods must be scheduled to the same node for this to work, so multi-node Deployments should use an RWX-capable backend instead.
Apply the Deployment with kubectl:
kubectl apply -f my-deployment.yaml
3. Verify the Deployment and Pods:
Check the status of your Deployment and pods to ensure they are running and that the PVC is correctly mounted:
kubectl get deployment my-deployment
kubectl get pods --selector=app=my-app
You can also describe one of the pods to see more details about the volume mounts:
kubectl describe pod <pod-name>
Replace <pod-name> with the actual name of one of your pods.
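To confirm the mount works end to end, you can write a file through one pod and read it back (the pod name is a placeholder):
kubectl exec <pod-name> -- sh -c 'echo hello > /usr/share/nginx/html/index.html'
kubectl exec <pod-name> -- cat /usr/share/nginx/html/index.html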
By following these steps, you'll have a Kubernetes Deployment that creates pods using a PVC for persistent storage. Remember to adjust the image, volume mount path, and any other configuration details to match the specific needs of your application.
Ways of associating PV to PVC
In Kubernetes, a Persistent Volume Claim (PVC) is typically bound to a Persistent Volume (PV) using the storage class and the capacity requirements specified in the PVC. However, there are other ways to associate a PVC with a PV, which can be useful in scenarios where you need more control over the binding process. Here are some alternative methods:
1. Manual Static Provisioning:
When you manually pre-provision PVs, you can ensure that a specific PVC binds to a particular PV by matching the accessModes and resources.requests.storage values. In addition, you can use labels and selectors to make the match more explicit. Example PV with a custom label:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
  labels:
    type: local
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - your-node-name
Example PVC with a selector that matches the label:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      type: local
In this example, the PVC will only bind to PVs with a label type: local.
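Listing PVs by that label shows which volumes are eligible to satisfy the claim:
kubectl get pv -l type=local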
2. VolumeName Field in PVC:
You can explicitly specify the name of the PV you want your PVC to bind to by setting the volumeName field in the PVC spec. Example PVC with volumeName set:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  volumeName: my-pv
This method directly associates the PVC with the specified PV, bypassing the usual dynamic provisioning process.
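After the PVC is created, you can read the bound volume name back out to confirm the association:
kubectl get pvc my-pvc -o jsonpath='{.spec.volumeName}'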
3. StorageClass and VolumeBindingMode:
By setting the volumeBindingMode field in a StorageClass to WaitForFirstConsumer, you can delay the binding and provisioning of a PV until a pod that uses the PVC is created. This can be useful for local volumes, where the PV must be on the same node as the pod. Example StorageClass with WaitForFirstConsumer:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-wait
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
PVCs that reference this StorageClass will wait to bind until a pod requests the PVC.
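A claim that uses this class looks like any other apart from its storageClassName; a minimal sketch with an illustrative name:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wait-pvc                 # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-wait   # the WaitForFirstConsumer class above
  resources:
    requests:
      storage: 5Gi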
4. Pre-Binding PVC to PV:
You can pre-bind a PVC to a PV before creating the PVC by specifying claimRef in the PV spec. This method is not commonly used because it requires manual intervention and careful coordination. Example PV with claimRef set:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  claimRef:
    namespace: default
    name: my-pvc
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - your-node-name
When the PVC my-pvc is created, it will automatically bind to my-pv.
Each of these methods provides a different level of control over the binding process between PVs and PVCs; choose the one that best matches how much manual coordination your environment allows.
Conclusion
Persistent Volumes and Persistent Volume Claims are pivotal components in the Kubernetes storage model, facilitating the deployment and management of stateful applications. By abstracting storage details away from the application layer, Kubernetes enables a more flexible, efficient, and developer-friendly approach to persistent storage. As the Kubernetes ecosystem continues to evolve, PVs and PVCs will remain central to its strategy for stateful workload orchestration.
References
Kubernetes Documentation. Persistent Volumes. https://kubernetes.io/docs/concepts/storage/persistent-volumes/
Kubernetes Documentation. Persistent Volume Claims. https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims