High Availability Kubernetes Cluster with Ceph Storage Deployment
Reza Bojnordi
Site Reliability Engineer @ BCW Group | Solutions Architect, Google Cloud, OpenStack, and Ceph Storage
Introduction
This article covers the deployment of a high-availability Kubernetes cluster with load balancing, control plane redundancy, worker nodes, and Ceph storage integration. The setup ensures fault tolerance, scalability, and seamless data management.
Cluster Architecture
The architecture consists of:
Node Details
Load balancer nodes: run Keepalived and HAProxy and share the virtual IP 172.16.16.100
Control plane nodes: kmaster1 (172.16.16.101), kmaster2 (172.16.16.102), kmaster3 (172.16.16.103)
Worker nodes: joined to the cluster after initialization (e.g. worker1)
Ceph storage nodes: ceph1 (172.16.16.210), ceph2 (172.16.16.211), ceph3 (172.16.16.212)
Virtual IP (VIP)
The VIP 172.16.16.100 is managed by Keepalived and fronts the Kubernetes API server on port 6443 through HAProxy, so nodes and clients always reach a healthy control plane node.
Deployment Steps
1. Setting Up Load Balancer Nodes
2. Deploying Kubernetes Cluster
3. Deploying Ceph Storage
Detailed Installation Steps
1. Configuring Load Balancer Nodes
Install Keepalived & HAProxy
apt update && apt install -y keepalived haproxy
Configure Keepalived
Create a health check script /etc/keepalived/check_apiserver.sh:
cat > /etc/keepalived/check_apiserver.sh <<'EOF'
#!/bin/sh
errorExit() {
echo "*** $@" 1>&2
exit 1
}
curl --silent --max-time 2 --insecure https://localhost:6443/ -o /dev/null || errorExit "Error GET https://localhost:6443/"
if ip addr | grep -q 172.16.16.100; then
curl --silent --max-time 2 --insecure https://172.16.16.100:6443/ -o /dev/null || errorExit "Error GET https://172.16.16.100:6443/"
fi
EOF
chmod +x /etc/keepalived/check_apiserver.sh
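You can run the script by hand as a quick sanity check; it is expected to fail until an API server is actually answering on port 6443:
sh /etc/keepalived/check_apiserver.sh && echo "check passed" || echo "check failed (expected until the control plane is up)"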
Create Keepalived configuration /etc/keepalived/keepalived.conf:
cat > /etc/keepalived/keepalived.conf <<EOF
vrrp_script check_apiserver {
script "/etc/keepalived/check_apiserver.sh"
interval 3
timeout 10
fall 5
rise 2
weight -2
}
vrrp_instance VI_1 {
state BACKUP
interface eth1
virtual_router_id 1
priority 100
advert_int 5
authentication {
auth_type PASS
auth_pass mysecret
}
virtual_ipaddress {
172.16.16.100
}
track_script {
check_apiserver
}
}
EOF
Enable Keepalived:
systemctl enable --now keepalived
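Once both load balancer nodes run Keepalived, the VIP should appear on eth1 of whichever node holds the MASTER role; a quick check (eth1 and the VIP come from the configuration above):
ip addr show eth1 | grep 172.16.16.100
systemctl status keepalived --no-pager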
Configure HAProxy
Update /etc/haproxy/haproxy.cfg:
cat >> /etc/haproxy/haproxy.cfg <<EOF
frontend kubernetes-frontend
bind *:6443
mode tcp
option tcplog
default_backend kubernetes-backend
backend kubernetes-backend
option httpchk GET /healthz
http-check expect status 200
mode tcp
option ssl-hello-chk
balance roundrobin
server kmaster1 172.16.16.101:6443 check fall 3 rise 2
server kmaster2 172.16.16.102:6443 check fall 3 rise 2
server kmaster3 172.16.16.103:6443 check fall 3 rise 2
EOF
Enable and restart HAProxy:
systemctl enable haproxy && systemctl restart haproxy
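To confirm HAProxy is accepting connections on the API port (ss comes from iproute2; netstat works as well):
ss -tlnp | grep 6443
systemctl status haproxy --no-pager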
2. Deploying Kubernetes Cluster
Run the following preparation steps on every Kubernetes node (control plane and workers).
Disable swap: comment out the swap entry in /etc/fstab, then turn swap off for the running system:
sudo nano /etc/fstab
sudo swapoff -a
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg
Load the required kernel modules and make them persistent across reboots:
sudo tee /etc/modules-load.d/kubernetes.conf <<EOF
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
Apply the required sysctl settings:
sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
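A quick check that the modules are loaded and the sysctl values took effect:
lsmod | grep -E 'overlay|br_netfilter'
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward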
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/containerd.gpg
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update
sudo apt install containerd.io -y
sudo containerd config default | sudo tee /etc/containerd/config.toml
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd
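Before moving on, it is worth confirming that containerd is running with the new configuration:
sudo systemctl status containerd --no-pager
sudo ctr version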
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/k8s.gpg
echo "deb [signed-by=/etc/apt/keyrings/k8s.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update
sudo apt install kubelet kubeadm kubectl -y
sudo apt-mark hold kubelet kubeadm kubectl
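You can verify the pinned versions before initializing the cluster:
kubeadm version -o short
kubelet --version
kubectl version --client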
Initialize the Control Plane
On the first control plane node, initialize the cluster with the control plane endpoint set to the load-balanced VIP so that the API server is reached through HAProxy:
sudo kubeadm init --control-plane-endpoint="172.16.16.100:6443" --upload-certs --pod-network-cidr=192.168.0.0/16
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
Print the join command to add more nodes to the cluster:
kubeadm token create --print-join-command
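The printed command joins worker nodes. The remaining control plane nodes use the same join command plus --control-plane and a certificate key; if the key from kubeadm init has expired, you can regenerate it with the first command below. The join line is only a sketch with placeholder values, not real output:
sudo kubeadm init phase upload-certs --upload-certs
kubeadm join 172.16.16.100:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash> --control-plane --certificate-key <certificate-key>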
kubectl get nodes
kubectl get pods --all-namespaces --watch
kubectl cluster-info
kubectl get nodes
kubectl get nodes -o wide
kubectl get pods --namespace kube-system
kubectl get pods --namespace kube-system -o wide
kubectl get pods --all-namespaces --watch
kubectl get pods pod1 --output=yaml
kubectl create deployment nginx --image=nginx
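As an optional sanity check of the new cluster, you can expose and scale the test deployment (the nginx name refers to the deployment created above):
kubectl expose deployment nginx --port=80 --type=NodePort
kubectl scale deployment nginx --replicas=3
kubectl get svc nginx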
kubectl get all --all-namespaces | more
kubectl api-resources | more
kubectl api-resources | grep pod
#explain
kubectl explain pod | more
kubectl explain pod.spec | more
kubectl explain pod.spec.containers | more
kubectl describe nodes worker1 | more
kubectl get -h | more
sudo apt install bash-completion
echo "source <(kubectl completion bash)" >> ~/.bashrc
source ~/.bashrc
kubectl g[tab]
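Optionally, you can also alias kubectl to k and keep completion working for the alias (the __start_kubectl hook is provided by kubectl's bash completion script sourced above):
echo "alias k=kubectl" >> ~/.bashrc
echo "complete -o default -F __start_kubectl k" >> ~/.bashrc
source ~/.bashrc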
Deploy Ceph Using Cephadm
root@ceph1:~# curl --silent --remote-name --location https://github.com/ceph/ceph/raw/pacific/src/cephadm/cephadm
root@ceph1:~# ls
cephadm
root@ceph1:~# chmod +x cephadm
Add repo
root@ceph1:~# ./cephadm add-repo --release pacific
root@ceph1:~# ./cephadm install
root@ceph1:~# which cephadm
/usr/sbin/cephadm
Bootstrap a new cluster
The first step in creating a new Ceph cluster is running the cephadm bootstrap command on the cluster's first host. Bootstrapping creates the cluster's first monitor daemon, and that daemon needs an IP address, so you must pass the IP address of the first host to the command.
At the end of its output, the bootstrap command prints the username and password for the Ceph dashboard.
root@ceph1:~# cephadm bootstrap --mon-ip 172.16.16.210 --allow-fqdn-hostname
You can see that your first node is ready:
root@ceph1:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f948bbd65858 quay.io/ceph/ceph-grafana:8.3.5 "/bin/sh -c 'grafana…" 6 minutes ago Up 6 minutes ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-grafana-ceph1
da9940de0261 quay.io/prometheus/alertmanager:v0.23.0 "/bin/alertmanager -…" 6 minutes ago Up 6 minutes ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-alertmanager-ceph1
5d64f507e598 quay.io/prometheus/prometheus:v2.33.4 "/bin/prometheus --c…" 6 minutes ago Up 6 minutes ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-prometheus-ceph1
189ba16afeef quay.io/prometheus/node-exporter:v1.3.1 "/bin/node_exporter …" 6 minutes ago Up 6 minutes ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-node-exporter-ceph1
6dc14e163713 quay.io/ceph/ceph "/usr/bin/ceph-crash…" 6 minutes ago Up 6 minutes ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-crash-ceph1
8f2887215bf4 quay.io/ceph/ceph "/usr/bin/ceph-mon -…" 6 minutes ago Up 6 minutes ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-mon-ceph1
d555bddb6bcc quay.io/ceph/ceph "/usr/bin/ceph-mgr -…" 6 minutes ago Up 6 minutes ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-mgr-ceph1-mmsoeo
You can install the ceph-common package, which contains all of the Ceph commands, including ceph, rbd, mount.ceph (for mounting CephFS file systems), etc.:
root@ceph1:~# cephadm install ceph-common
Now you can run the ceph commands natively:
root@ceph1:~# ceph health
HEALTH_OK
Adding additional hosts to the cluster
Note: before adding a new node to the cluster, I set the mon service to unmanaged; otherwise cephadm would automatically deploy a mon on ceph2. (For quorum, a single mon is enough at this stage.)
root@ceph1:~# ceph orch apply mon --unmanaged
To add each new host to the cluster, perform two steps:
root@ceph1:~# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph2
root@ceph1:~# ceph orch host add ceph2 172.16.16.211 --labels _admin
Wait a little while and you will see two mgr daemons in the cluster:
root@ceph1:~# ceph -s
cluster:
id: e0ed5b04-2d51-11ed-99fd-4124623d1806
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 1 daemons, quorum ceph1 (age 20m)
mgr: ceph1.mmsoeo(active, since 20m), standbys: ceph2.ewfwsb
Adding OSDs
This is a lab, so I am using loop devices to create multiple OSDs that mimic real disks. (Don't use loop devices in production.)
Create the LVM disks on both the ceph1 and ceph2 nodes:
$ fallocate -l 200G 200GB-SSD-0.img
$ fallocate -l 200G 200GB-SSD-1.img
$ losetup -fP 200GB-SSD-0.img
$ losetup -fP 200GB-SSD-1.img
$ pvcreate /dev/loop0
$ pvcreate /dev/loop1
$ vgcreate ceph-ssd-vg /dev/loop0 /dev/loop1
$ lvcreate --size 199G --name ceph-ssd-lv-0 ceph-ssd-vg
$ lvcreate --size 199G --name ceph-ssd-lv-1 ceph-ssd-vg
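Before handing the volumes to Ceph, you can confirm the logical volumes exist on each node (plain LVM tooling, nothing Ceph-specific):
$ lvs ceph-ssd-vg
$ lsblk /dev/loop0 /dev/loop1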
We have two 199 GB LVM disks per node, so we can add a total of four OSDs to Ceph.
root@ceph1:~# ceph orch daemon add osd ceph1:/dev/ceph-ssd-vg/ceph-ssd-lv-0
root@ceph1:~# ceph orch daemon add osd ceph1:/dev/ceph-ssd-vg/ceph-ssd-lv-1
root@ceph1:~# ceph orch daemon add osd ceph2:/dev/ceph-ssd-vg/ceph-ssd-lv-0
root@ceph1:~# ceph orch daemon add osd ceph2:/dev/ceph-ssd-vg/ceph-ssd-lv-1
Wait a little while, then check the status:
root@ceph1:~# ceph osd stat
4 osds: 4 up (since 2h), 4 in (since 5h); epoch: e103
root@ceph1:~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.77716 root default
-3 0.38858 host ceph1
0 ssd 0.19429 osd.0 up 1.00000 1.00000
1 ssd 0.19429 osd.1 up 1.00000 1.00000
-5 0.38858 host ceph2
2 ssd 0.19429 osd.2 up 1.00000 1.00000
3 ssd 0.19429 osd.3 up 1.00000 1.00000
Scale mon daemons
Currently we have only two nodes in the Ceph cluster, which is why there is a single mon. I am going to add a third node to the cluster and run the mon service on all three nodes for better redundancy.
Current status of the cluster:
root@ceph1:~# ceph orch host ls
HOST ADDR LABELS STATUS
ceph1 172.16.16.210 _admin
ceph2 172.16.16.211 _admin
2 hosts in cluster
Add a new node, ceph3, to the cluster
Prerequisite: install docker-ce on the new host (ceph3). After that, copy the Ceph SSH key to the new host.
root@ceph1:~# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph3
Tell Ceph that the new node is part of the cluster:
root@ceph1:~# ceph orch host add ceph3 172.16.16.212 --labels _admin
A few minutes later:
root@ceph1:~# ceph orch host ls
HOST ADDR LABELS STATUS
ceph1 172.16.16.210 _admin
ceph2 172.16.16.211 _admin
ceph3 172.16.16.212 _admin
3 hosts in cluster
Now let's tell cephadm to add two more mon daemons, on ceph2 and ceph3:
root@ceph1:~# ceph orch daemon add mon ceph2:172.16.16.211
root@ceph1:~# ceph orch daemon add mon ceph3:172.16.16.212
Now enable automatic placement of the mon daemons:
root@ceph1:~# ceph orch apply mon --placement="ceph1,ceph2,ceph3" --dry-run
root@ceph1:~# ceph orch apply mon --placement="ceph1,ceph2,ceph3"
After a few minutes you will see three mon daemons:
root@ceph1:~# ceph mon stat
e12: 3 mons at {ceph1=[v2:172.16.16.210:3300/0,v1:172.16.16.210:6789/0],ceph2=[v2:172.16.16.211:3300/0,v1:172.16.16.211:6789/0],ceph3=[v2:172.16.16.212:3300/0,v1:172.16.16.212:6789/0]}, election epoch 48, leader 0 ceph1, quorum 0,1,2 ceph1,ceph3,ceph2
In more detail:
root@ceph1:~# ceph orch ps
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
alertmanager.ceph1 ceph1 *:9093,9094 running (13d) 5m ago 2w 50.9M - ba2b418f427c da9940de0261
crash.ceph1 ceph1 running (13d) 5m ago 2w 8584k - 17.2.3 0912465dcea5 6dc14e163713
crash.ceph2 ceph2 running (13d) 110s ago 2w 7568k - 17.2.3 0912465dcea5 046663e7d825
crash.ceph3 ceph3 running (28m) 44s ago 28m 9092k - 17.2.3 0912465dcea5 068bcd9a1b0c
grafana.ceph1 ceph1 *:3000 running (13d) 5m ago 2w 130M - 8.3.5 dad864ee21e9 f948bbd65858
mgr.ceph1.mmsoeo ceph1 *:8443,9283 running (13d) 5m ago 2w 688M - 17.2.3 0912465dcea5 d555bddb6bcc
mgr.ceph2.ewfwsb ceph2 *:8443,9283 running (13d) 110s ago 2w 436M - 17.2.3 0912465dcea5 17c25c7ac9d7
mon.ceph1 ceph1 running (13d) 5m ago 2w 464M 2048M 17.2.3 0912465dcea5 8f2887215bf4
mon.ceph2 ceph2 running (12m) 110s ago 12m 55.1M 2048M 17.2.3 0912465dcea5 f93695536d9e
mon.ceph3 ceph3 running (21m) 44s ago 21m 84.8M 2048M 17.2.3 0912465dcea5 2532ddaed999
node-exporter.ceph1 ceph1 *:9100 running (13d) 5m ago 2w 67.7M - 1dbe0e931976 189ba16afeef
node-exporter.ceph2 ceph2 *:9100 running (13d) 110s ago 2w 67.5M - 1dbe0e931976 f87b1ec6f349
node-exporter.ceph3 ceph3 *:9100 running (28m) 44s ago 28m 46.3M - 1dbe0e931976 283c6d21ea9c
osd.0 ceph1 running (13d) 5m ago 2w 1390M 17.2G 17.2.3 0912465dcea5 3456c126e322
osd.1 ceph1 running (13d) 5m ago 2w 1373M 17.2G 17.2.3 0912465dcea5 7c1fa2662443
osd.2 ceph2 running (13d) 110s ago 2w 1534M 18.7G 17.2.3 0912465dcea5 336e424bbbb2
osd.3 ceph2 running (13d) 110s ago 2w 1506M 18.7G 17.2.3 0912465dcea5 874a811e1f3b
prometheus.ceph1 ceph1 *:9095 running (28m) 5m ago 2w 122M - 514e6a882f6e 93972e9bdfa9
Ceph Maintenance Options
To perform any kind of maintenance on OSD nodes, you can set the following flags:
ceph osd set noout
ceph osd set norebalance
ceph osd set norecover
To exit maintenance mode, unset the flags:
ceph osd unset noout
ceph osd unset norebalance
ceph osd unset norecover
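Although full Kubernetes integration is out of scope for this post, once the cluster is healthy you would typically prepare an RBD pool and a client for a CSI driver such as ceph-csi. A minimal sketch, assuming a pool named kubernetes and a client named client.kubernetes (both names are arbitrary choices, not anything created above):
ceph osd pool create kubernetes
rbd pool init kubernetes
ceph auth get-or-create client.kubernetes mon 'profile rbd' osd 'profile rbd pool=kubernetes' mgr 'profile rbd pool=kubernetes'
The key printed by the last command is what the Kubernetes side (for example, the ceph-csi secrets) would consume.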
In the next post I will cover how to integrate Ceph with a kolla-ansible deployment. Enjoy!