High Availability Kubernetes Cluster with Ceph Storage Deployment

Introduction

This article covers the deployment of a high-availability Kubernetes cluster with load balancing, control plane redundancy, worker nodes, and Ceph storage integration. The setup ensures fault tolerance, scalability, and seamless data management.

Cluster Architecture

The architecture consists of:

  • Load Balancers: HAProxy and Keepalived for high availability.
  • Kubernetes Masters: Three master nodes for control plane redundancy.
  • Worker Nodes: Four worker nodes for application workloads.
  • Ceph Storage: Three Ceph nodes for distributed storage.


Node Details


Virtual IP (VIP)

  • Managed by Keepalived: 172.16.16.100
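
For reference, a minimal /etc/hosts sketch covering all nodes might look like the following. The master and Ceph addresses are taken from the configuration used later in this article; the load balancer and worker addresses are assumptions for illustration only, so adjust them to your environment.

172.16.16.51   loadbalancer1   # assumed address
172.16.16.52   loadbalancer2   # assumed address
172.16.16.100  kube-vip        # Keepalived virtual IP
172.16.16.101  kmaster1
172.16.16.102  kmaster2
172.16.16.103  kmaster3
172.16.16.201  kworker1        # assumed address
172.16.16.202  kworker2        # assumed address
172.16.16.203  kworker3        # assumed address
172.16.16.204  kworker4        # assumed address
172.16.16.210  ceph1
172.16.16.211  ceph2
172.16.16.212  ceph3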

Deployment Steps

1. Setting Up Load Balancer Nodes

  • Install Keepalived & HAProxy
  • Configure HAProxy for API server load balancing
  • Configure Keepalived for high availability

2. Deploying Kubernetes Cluster

  • Disable swap and configure kernel parameters
  • Install Kubernetes packages
  • Initialize the cluster and join nodes
  • Deploy networking (Calico)

3. Deploying Ceph Storage

  • Install Cephadm
  • Bootstrap Ceph cluster
  • Add storage nodes and configure OSDs
  • Enable monitoring and dashboard

Detailed Installation Steps

1. Configuring Load Balancer Nodes

Install Keepalived & HAProxy

apt update && apt install -y keepalived haproxy        

Configure Keepalived

Create a health check script /etc/keepalived/check_apiserver.sh:

cat > /etc/keepalived/check_apiserver.sh <<'EOF'
#!/bin/sh
errorExit() {
  echo "*** $@" 1>&2
  exit 1
}
curl --silent --max-time 2 --insecure https://localhost:6443/ -o /dev/null || errorExit "Error GET https://localhost:6443/"
if ip addr | grep -q 172.16.16.100; then
  curl --silent --max-time 2 --insecure https://172.16.16.100:6443/ -o /dev/null || errorExit "Error GET https://172.16.16.100:6443/"
fi
EOF
chmod +x /etc/keepalived/check_apiserver.sh        

Create Keepalived configuration /etc/keepalived/keepalived.conf:

cat > /etc/keepalived/keepalived.conf <<EOF
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  timeout 10
  fall 5
  rise 2
  weight -2
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth1
    virtual_router_id 1
    priority 100
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass mysecret
    }
    virtual_ipaddress {
        172.16.16.100
    }
    track_script {
        check_apiserver
    }
}
EOF        

Repeat the same configuration on the second load balancer with a lower priority value (for example 99), then enable Keepalived on both nodes:

systemctl enable --now keepalived        
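
To verify the setup, a quick check (assuming eth1 as in the configuration above) is to confirm the service is running and that the virtual IP is present on whichever node currently holds the MASTER state:

systemctl status keepalived
ip addr show eth1 | grep 172.16.16.100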

Configure HAProxy

Update /etc/haproxy/haproxy.cfg:

cat >> /etc/haproxy/haproxy.cfg <<EOF
frontend kubernetes-frontend
  bind *:6443
  mode tcp
  option tcplog
  default_backend kubernetes-backend

backend kubernetes-backend
  option httpchk GET /healthz
  http-check expect status 200
  mode tcp
  option ssl-hello-chk
  balance roundrobin
    server kmaster1 172.16.16.101:6443 check fall 3 rise 2
    server kmaster2 172.16.16.102:6443 check fall 3 rise 2
    server kmaster3 172.16.16.103:6443 check fall 3 rise 2
EOF        

Enable and restart HAProxy:

systemctl enable haproxy && systemctl restart haproxy        
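
Before relying on it, you may want to validate the configuration and confirm that the VIP answers on port 6443 (until the control plane is initialized the backends will be reported as down, which is expected):

haproxy -c -f /etc/haproxy/haproxy.cfg
nc -zv 172.16.16.100 6443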

Next Steps


  • Kubernetes Cluster Setup: Installing required components and initializing the control plane.
  • Adding Worker Nodes: Joining additional nodes to the cluster.
  • Ceph Storage Deployment: Bootstrapping and configuring a Ceph cluster.
  • Monitoring and Logging: Setting up Prometheus and Grafana for monitoring.
  • Security Hardening: Implementing RBAC and network policies for a secure deployment.

2. Deploying the Kubernetes Cluster

Run the following preparation steps on every master and worker node unless noted otherwise.

Disable Swap

Comment out any swap entry in /etc/fstab so that swap stays disabled across reboots, then turn it off for the current session:

sudo nano /etc/fstab

sudo swapoff -a

Install Prerequisites

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg

Kernel Modules and sysctl Settings

Load the required kernel modules and configure the sysctl parameters Kubernetes needs:

cat <<EOF | sudo tee /etc/modules-load.d/kubernetes.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter

sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
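
A quick sanity check that the modules are loaded and the sysctl values took effect:

lsmod | grep -E 'overlay|br_netfilter'
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward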

Install containerd

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/containerd.gpg
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update


sudo apt install containerd.io -y

sudo containerd config default | sudo tee /etc/containerd/config.toml
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd
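
To confirm that containerd restarted cleanly with the systemd cgroup driver enabled:

grep SystemdCgroup /etc/containerd/config.toml
sudo systemctl status containerd --no-pager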


Install Kubernetes Packages

sudo mkdir -p /etc/apt/keyrings

curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/k8s.gpg

echo "deb [signed-by=/etc/apt/keyrings/k8s.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list

sudo apt update

sudo apt install kubelet kubeadm kubectl -y
sudo apt-mark hold kubelet kubeadm kubectl
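
To confirm the packages are installed and pinned:

kubeadm version
kubectl version --client
apt-mark showhold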
Initialize the Control Plane

Run the following on the first master node (kmaster1) only. Pointing --control-plane-endpoint at the Keepalived VIP allows the remaining masters to join through the load balancer:

sudo kubeadm init --control-plane-endpoint "172.16.16.100:6443" --upload-certs --pod-network-cidr=192.168.0.0/16

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
        

Join the Remaining Nodes

Print the join command on the first master and run it on each node that should join the cluster:

kubeadm token create --print-join-command
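
The printed command is run on each node that joins the cluster. It will look roughly like the sketch below; the token, hash, and certificate key are placeholders, so use the values printed for your own cluster. Additional control-plane nodes also need the --control-plane flag and the certificate key produced by the --upload-certs step.

# On each worker node (placeholder token and hash):
sudo kubeadm join 172.16.16.100:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>

# On kmaster2 and kmaster3 (placeholder certificate key):
sudo kubeadm join 172.16.16.100:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane --certificate-key <certificate-key>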

Verify the Cluster

kubectl get nodes

kubectl get pods --all-namespaces --watch        
kubectl cluster-info

kubectl get nodes
kubectl get nodes -o wide
kubectl get pods --namespace kube-system

kubectl get pods --namespace kube-system -o wide

kubectl get pods --all-namespaces --watch

kubectl get pods pod1 --output=yaml

kubectl create deployment nginx --image=nginx

kubectl get all --all-namespaces | more


kubectl api-resources | more


kubectl api-resources | grep pod


# Explore resource fields with kubectl explain

kubectl explain pod | more

kubectl explain pod.spec | more

kubectl explain pod.spec.containers | more


kubectl describe nodes worker1 | more

kubectl get -h | more

Enable kubectl bash completion:

sudo apt install bash-completion

echo "source <(kubectl completion bash)" >> ~/.bashrc

source ~/.bashrc

kubectl g[tab]
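
As a final smoke test, you might expose the nginx deployment created earlier as a NodePort service and check that it gets a cluster IP and endpoints (the service name simply matches the deployment name):

kubectl expose deployment nginx --port=80 --type=NodePort
kubectl get svc nginx
kubectl get endpoints nginx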
        

Deploy Ceph Using Cephadm


root@ceph1:~# curl --silent --remote-name --location https://github.com/ceph/ceph/raw/pacific/src/cephadm/cephadm
root@ceph1:~# ls
cephadm
root@ceph1:~# chmod +x cephadm        
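
A quick way to confirm the downloaded script works (it pulls the default Ceph container image, so it assumes Docker is already installed and the host has network access):

root@ceph1:~# ./cephadm version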

Add repo

root@ceph1:~# ./cephadm add-repo --release pacific
root@ceph1:~# ./cephadm install
root@ceph1:~# which cephadm
/usr/sbin/cephadm        

Bootstrap a new cluster

The first step in creating a new Ceph cluster is to run the cephadm bootstrap command on the cluster's first host. This creates the cluster's first monitor daemon, and that monitor daemon needs an IP address, so you must pass the IP address of the first host to the cephadm bootstrap command.

At the end of its output, the command prints the username and password for the Ceph dashboard.

root@ceph1:~# cephadm bootstrap --mon-ip 172.16.16.210 --allow-fqdn-hostname        

You can see that your first node is ready:

root@ceph1:~# docker ps
CONTAINER ID   IMAGE                                     COMMAND                  CREATED        STATUS        PORTS     NAMES
f948bbd65858   quay.io/ceph/ceph-grafana:8.3.5           "/bin/sh -c 'grafana…"   6 minutes ago   Up 6 minutes             ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-grafana-ceph1
da9940de0261   quay.io/prometheus/alertmanager:v0.23.0   "/bin/alertmanager -…"   6 minutes ago   Up 6 minutes             ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-alertmanager-ceph1
5d64f507e598   quay.io/prometheus/prometheus:v2.33.4     "/bin/prometheus --c…"   6 minutes ago   Up 6 minutes             ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-prometheus-ceph1
189ba16afeef   quay.io/prometheus/node-exporter:v1.3.1   "/bin/node_exporter …"   6 minutes ago   Up 6 minutes             ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-node-exporter-ceph1
6dc14e163713   quay.io/ceph/ceph                         "/usr/bin/ceph-crash…"   6 minutes ago   Up 6 minutes             ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-crash-ceph1
8f2887215bf4   quay.io/ceph/ceph                         "/usr/bin/ceph-mon -…"   6 minutes ago   Up 6 minutes             ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-mon-ceph1
d555bddb6bcc   quay.io/ceph/ceph                         "/usr/bin/ceph-mgr -…"   6 minutes ago   Up 6 minutes             ceph-e0ed5b04-2d51-11ed-99fd-4124623d1806-mgr-ceph1-mmsoeo        

You can install the ceph-common package, which contains all of the Ceph commands, including ceph, rbd, mount.ceph (for mounting CephFS file systems), etc.:

root@ceph1:~# cephadm install ceph-common
        

Now you can run the ceph command natively:

root@ceph1:~# ceph health
HEALTH_OK
        

Adding additional hosts to the cluster

NOTE: Before adding a new node to the cluster, set the mon service to unmanaged; otherwise cephadm will automatically deploy a mon on ceph2. (For now, a single mon is enough for quorum.)

root@ceph1:~# ceph orch apply mon --unmanaged
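
You can confirm that the mon service is now unmanaged by listing the orchestrator services; the PLACEMENT column should show it as unmanaged:

root@ceph1:~# ceph orch ls mon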
        

To add each new host to the cluster, perform two steps:

  1. Install the cluster’s public SSH key in the new host’s root user’s authorized_keys file:

root@ceph1:~# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph2
        

  2. Tell Ceph that the new node is part of the cluster:

root@ceph1:~# ceph orch host add ceph2 172.16.16.211 --labels _admin
        

Wait a little while and you will see two mgr daemons in the cluster:

root@ceph1:~# ceph -s
  cluster:
    id:     e0ed5b04-2d51-11ed-99fd-4124623d1806
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 1 daemons, quorum ceph1 (age 20m)
    mgr: ceph1.mmsoeo(active, since 20m), standbys: ceph2.ewfwsb
        

Adding OSDs

This is a lab, so I am using loop devices to create multiple OSDs that mimic real disks. (Do not use loop devices in production.)

Create LVM volumes on both the ceph1 and ceph2 nodes:

$ fallocate -l 200G 200GB-SSD-0.img
$ fallocate -l 200G 200GB-SSD-1.img

$ losetup -fP 200GB-SSD-0.img
$ losetup -fP 200GB-SSD-1.img

$ pvcreate /dev/loop0
$ pvcreate /dev/loop1

$ vgcreate ceph-ssd-vg /dev/loop0 /dev/loop1

$ lvcreate --size 199G --name ceph-ssd-lv-0 ceph-ssd-vg
$ lvcreate --size 199G --name ceph-ssd-lv-1 ceph-ssd-vg
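
Before handing the volumes to Ceph, it is worth confirming that the loop devices and logical volumes exist as expected:

$ losetup -a
$ lvs ceph-ssd-vg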
        

We now have two 199 GB LVM volumes per node, so we can add a total of four OSDs to Ceph:

root@ceph1:~# ceph orch daemon add osd ceph1:/dev/ceph-ssd-vg/ceph-ssd-lv-0
root@ceph1:~# ceph orch daemon add osd ceph1:/dev/ceph-ssd-vg/ceph-ssd-lv-1

root@ceph1:~# ceph orch daemon add osd ceph2:/dev/ceph-ssd-vg/ceph-ssd-lv-0
root@ceph1:~# ceph orch daemon add osd ceph2:/dev/ceph-ssd-vg/ceph-ssd-lv-1
        

Wait a little while and then check the status:

root@ceph1:~# ceph osd stat
4 osds: 4 up (since 2h), 4 in (since 5h); epoch: e103

root@ceph1:~# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME       STATUS  REWEIGHT  PRI-AFF
-1         0.77716  root default
-3         0.38858      host ceph1
 0    ssd  0.19429          osd.0       up   1.00000  1.00000
 1    ssd  0.19429          osd.1       up   1.00000  1.00000
-5         0.38858      host ceph2
 2    ssd  0.19429          osd.2       up   1.00000  1.00000
 3    ssd  0.19429          osd.3       up   1.00000  1.00000
        

Scale mon daemons

Currently we have only two nodes in the Ceph cluster, which is why there is a single mon daemon. I am going to add a third node to the cluster and run the mon service on all three nodes for better redundancy.

Current status of the cluster:

root@ceph1:~# ceph orch host ls
HOST   ADDR         LABELS  STATUS
ceph1  172.16.16.210  _admin
ceph2  172.16.16.211  _admin
2 hosts in cluster
        

Add a new ceph3 node to the cluster

Prerequisite: install docker-ce on the new host (ceph3). After that, copy the cluster's public SSH key to the new host:

root@ceph1:~# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph3
        

Tell Ceph that the new node is part of the cluster:

root@ceph1:~# ceph orch host add ceph3 172.16.16.212 --labels _admin
        

A few minutes later:

root@ceph1:~# ceph orch host ls
HOST   ADDR         LABELS  STATUS
ceph1  172.16.16.210  _admin
ceph2  172.16.16.211  _admin
ceph3  172.16.16.212  _admin
3 hosts in cluster
        

Now let's tell cephadm to add two more mon daemons on ceph2 and ceph3:

root@ceph1:~# ceph orch daemon add mon ceph2:172.16.16.211
root@ceph1:~# ceph orch daemon add mon ceph3:172.16.16.212
        

Now, enable automatic placement of the mon daemons:

root@ceph1:~# ceph orch apply mon --placement="ceph1,ceph2,ceph3" --dry-run
root@ceph1:~# ceph orch apply mon --placement="ceph1,ceph2,ceph3"
        

After a few minutes you will see three mon daemons:

root@ceph1:~# ceph mon stat
e12: 3 mons at {ceph1=[v2:172.16.16.210:3300/0,v1:172.16.16.210:6789/0],ceph2=[v2:172.16.16.211:3300/0,v1:172.16.16.211:6789/0],ceph3=[v2:172.16.16.212:3300/0,v1:172.16.16.212:6789/0]}, election epoch 48, leader 0 ceph1, quorum 0,1,2 ceph1,ceph3,ceph2
        

In more details

root@ceph1:~# ceph orch ps
NAME                 HOST   PORTS        STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
alertmanager.ceph1   ceph1  *:9093,9094  running (13d)     5m ago   2w    50.9M        -           ba2b418f427c  da9940de0261
crash.ceph1          ceph1               running (13d)     5m ago   2w    8584k        -  17.2.3   0912465dcea5  6dc14e163713
crash.ceph2          ceph2               running (13d)   110s ago   2w    7568k        -  17.2.3   0912465dcea5  046663e7d825
crash.ceph3          ceph3               running (28m)    44s ago  28m    9092k        -  17.2.3   0912465dcea5  068bcd9a1b0c
grafana.ceph1        ceph1  *:3000       running (13d)     5m ago   2w     130M        -  8.3.5    dad864ee21e9  f948bbd65858
mgr.ceph1.mmsoeo     ceph1  *:8443,9283  running (13d)     5m ago   2w     688M        -  17.2.3   0912465dcea5  d555bddb6bcc
mgr.ceph2.ewfwsb     ceph2  *:8443,9283  running (13d)   110s ago   2w     436M        -  17.2.3   0912465dcea5  17c25c7ac9d7
mon.ceph1            ceph1               running (13d)     5m ago   2w     464M    2048M  17.2.3   0912465dcea5  8f2887215bf4
mon.ceph2            ceph2               running (12m)   110s ago  12m    55.1M    2048M  17.2.3   0912465dcea5  f93695536d9e
mon.ceph3            ceph3               running (21m)    44s ago  21m    84.8M    2048M  17.2.3   0912465dcea5  2532ddaed999
node-exporter.ceph1  ceph1  *:9100       running (13d)     5m ago   2w    67.7M        -           1dbe0e931976  189ba16afeef
node-exporter.ceph2  ceph2  *:9100       running (13d)   110s ago   2w    67.5M        -           1dbe0e931976  f87b1ec6f349
node-exporter.ceph3  ceph3  *:9100       running (28m)    44s ago  28m    46.3M        -           1dbe0e931976  283c6d21ea9c
osd.0                ceph1               running (13d)     5m ago   2w    1390M    17.2G  17.2.3   0912465dcea5  3456c126e322
osd.1                ceph1               running (13d)     5m ago   2w    1373M    17.2G  17.2.3   0912465dcea5  7c1fa2662443
osd.2                ceph2               running (13d)   110s ago   2w    1534M    18.7G  17.2.3   0912465dcea5  336e424bbbb2
osd.3                ceph2               running (13d)   110s ago   2w    1506M    18.7G  17.2.3   0912465dcea5  874a811e1f3b
prometheus.ceph1     ceph1  *:9095       running (28m)     5m ago   2w     122M        -           514e6a882f6e  93972e9bdfa9
        

Ceph Maintenance Options

To perform any kind of maintenance on OSD nodes, you can set the following flags:

ceph osd set noout
ceph osd set norebalance
ceph osd set norecover
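
You can confirm which flags are active at any time (they also show up in the health section of ceph -s):

ceph osd dump | grep flags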
        

To exit maintenance mode, unset the flags:

ceph osd unset noout
ceph osd unset norebalance
ceph osd unset norecover
        

In the next post I will cover how to integrate Ceph with a kolla-ansible deployment. Enjoy!
