Kubernetes Architecture, Concepts & Best Practices

Why Kubernetes?

If we want to know why and when to use Kubernetes, we need to step back and understand the difference between plain Docker and Kubernetes.

Let us start with a simple cloud native application. Say the front end is written in React and backed by Node.js, a Java service handles database access, and a Python service calls external APIs and serves REST endpoints.

Putting on my DevOps engineer hat, I will take a pure Docker approach to deploy this application and create the application stack.

So, we will have the basic hardware, then the OS (Ubuntu), and then the Docker daemon installed on top of the OS, which allows us to spin up containers.

Then we deploy the Node.js, Java, and Python applications as microservices in those containers.

But let us imagine that the application becomes more popular and starts to get a lot more load. Many more people are using it, and we realize that we need to scale out to provide a better user experience.

So, as a DevOps engineer, the first instinct might be:

“I’ve already got scripts to make this stack, so let’s simply get some new hardware and do the exact same deployment multiple times.”

This can fall apart for many reasons when we start moving to scale.

For example, what if the team has to create a new microservice to support a new requirement – where do we fit it in, especially if we have already used up the available hardware, compute, and storage?

In addition, a big advantage of microservice-based applications is the ability to scale out components individually. A DevOps engineer would therefore have to write scripts that figure out the most effective way to scale each component in response to load, and to identify and resolve user experience issues at scale.

So, this is where an orchestration tool like Kubernetes comes in: it allows us to keep our existing Dockerized application but orchestrate it and make more effective use of our compute and storage.

Kubernetes makes current and future deployments easier and is an orchestration tool for Docker-based applications (microservices).

For any scale-out or scale-in event, Kubernetes automatically spins up instances with the required number of Pods and containers in them. Microservices are exposed as "Services," so consuming applications do not need to track the underlying instances.

Why is Kubernetes called K8s?

The abbreviation K8s is derived by replacing the eight letters of “ubernete” with the digit 8.

Google open sourced the Kubernetes project in 2014, drawing on more than a decade of experience running production workloads on its internal systems. Today, the Cloud Native Computing Foundation maintains the project.

The Kubernetes project maintains release branches for the most recent three minor releases. Kubernetes versions are expressed as x.y.z, where x is the major version, y is the minor version, and z is the patch version.

Architecture of Kubernetes (K8s)

Kubernetes is an open-source container orchestration tool; the containers can be Docker-based or rkt-based. It helps to manage container-based applications across deployment environments such as physical machines, virtual machines, and the cloud.

At a high level, Kubernetes provides the following benefits:

  • Kubernetes helps in deployment, development, and monitoring
  • High availability – the application will have no downtime and will always be available to end users
  • Scalability – for high performance
  • Durability & resiliency – to always run in the desired state

A K8s cluster consists of a master and a number of worker nodes, which provide compute, storage, and network resources to the containers (microservices/apps).

Kubernetes Master

  • API server
  • Controller manager
  • Scheduler
  • Etcd database

Kubernetes Worker Nodes

  • Kubelet – Node agent connecting to the master
  • Kube-proxy – Virtual network connecting Pods internally
  • Pods – The smallest deployable units, each wrapping one or more containers
  • Containers – The microservices themselves

Kubernetes Control Plane

Kubernetes Master Node Processes

API Server (gatekeeper of the cluster)

  • Entry point to the K8s cluster from any of the K8s clients, such as the UI (dashboard), the CLI (kubectl), or direct API calls.
  • Deployment manifests (YAML or JSON) describing Pods, containers, and replicas are sent to the API server process by a K8s client; a minimal example follows below.
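
As an illustration, here is a minimal sketch of a Deployment manifest of the kind a client might submit with kubectl apply -f deployment.yaml. The name, labels, image, and port are placeholders, not from the original article:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-frontend            # hypothetical name
spec:
  replicas: 3                    # desired number of Pods
  selector:
    matchLabels:
      app: node-frontend
  template:
    metadata:
      labels:
        app: node-frontend
    spec:
      containers:
      - name: node-frontend
        image: node-frontend:1.0 # hypothetical image
        ports:
        - containerPort: 3000

The API server validates the manifest, persists the desired state in etcd, and the controller manager and scheduler act on it from there.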

Controller Manager

  • Keeps track of what is happening in the cluster. If it finds a container not running, it restores it. It also ensures that the nodes are running the correct number of Pods, as specified in the deployment file.
  • In short, it checks that "desired state == actual state".

Scheduler

  • Controls Pod placement in the K8s cluster. Based on the deployment file, it will schedule a Pod with the required configuration specs, such as 1 vCPU, 4 GB RAM, SSD storage, IOPS, etc.
  • It checks which worker nodes meet the Pod's deployment configuration and schedules the Pod onto one of them; a sketch of such a resource specification follows below.
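
As a sketch, resource requests and limits like the ones below are what the scheduler matches against available node capacity. The Pod name, image, and values are illustrative, not from the original article:

apiVersion: v1
kind: Pod
metadata:
  name: api-pod                  # hypothetical name
spec:
  containers:
  - name: api
    image: api-service:1.0       # hypothetical image
    resources:
      requests:                  # what the scheduler uses for placement
        cpu: "1"                 # 1 vCPU
        memory: "4Gi"            # 4 GB RAM
      limits:                    # hard caps enforced at runtime
        cpu: "2"
        memory: "4Gi"

The scheduler places the Pod only on a node with enough unreserved capacity to satisfy the requests; the limits are then enforced by the node at runtime.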

Etcd

  • It is a lightweight key-value database. It stores the current state of the cluster, and any cluster component can query etcd to compare the current state of the cluster with the desired state.

Kubernetes Worker Nodes

Kubelet – Node agent connecting to the master

  • It is the primary node agent that runs on each worker node.
  • It registers the node with the API server and maintains a set of Pods, each composed of one or more containers.
  • It looks at the Pods’ specification submitted into the API server in the master node and ensures that the containers (microservices) are healthy as per the Pods’ specs.
  • If the Kubelet finds issues with a certain Pod, it tries to restart the Pod on the same worker node.
  • If the worker node itself has issues, the K8s master detects the node failure and recreates the Pod on another worker node.
  • Whether the Pod is recreated depends on whether it is controlled by a ReplicaSet or a replication controller.

Pods – The smallest deployable units

  • A Pod is the smallest unit that you can configure and deploy. Each worker node will have one or more Pods, and each Pod will have one or more containers for the microservices.
  • Usually, we will have one container (microservice) per Pod, but if the main application needs a helper application, there can be more than one container in a single Pod; a sketch follows below.
  • The containers inside the Pod run the microservices themselves.
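
A minimal sketch of a multi-container Pod, where a helper (sidecar) container ships the main application's logs; all names and images here are hypothetical:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-helper          # hypothetical name
spec:
  containers:
  - name: main-app
    image: main-app:1.0          # hypothetical main microservice
  - name: log-shipper            # helper (sidecar) container
    image: log-shipper:1.0       # hypothetical helper image

Both containers share the Pod's network namespace and can share volumes, which is exactly why a tightly coupled helper belongs in the same Pod.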

Kube-proxy – Virtual network for connecting Pods internally

  • The virtual network assigns a (dynamic) IP address to each Pod, and Pods communicate internally over this network.
  • If a container inside a Pod dies, a new Pod is created and assigned a new IP address. So an application talking to, say, a DB Pod would need to keep updating the IP address it uses, which is complex and not a cloud native solution.
  • Hence, a Service is created in front of the Pods, and applications refer to the Service name without caring about the underlying dynamic IP addresses; see the sketch after this list.
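
A minimal Service sketch: it gives a set of Pods a stable DNS name and virtual IP. The Service name, label selector, and port are illustrative, not from the original article:

apiVersion: v1
kind: Service
metadata:
  name: db-service               # hypothetical stable name
spec:
  selector:
    app: mongodb                 # matches the labels on the DB Pods
  ports:
  - port: 27017                  # port the Service exposes
    targetPort: 27017            # port the container listens on

Client applications then connect to db-service:27017 via cluster DNS, and kube-proxy routes the traffic to whichever healthy Pod currently backs the Service.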

Kubernetes (K8s) Best Practices

Kubernetes Lifecycle Management

  • K8s lifecycle management, such as upgrades and enhancements, is cumbersome if you have built your own cluster on bare metal or VMs.
  • It is often easier to build a new cluster on the latest version and migrate workloads to it than to perform in-place node upgrades.
  • K8s has lots of moving parts that need to stay aligned across upgrades.
  • There are tools like kops, Kubespray, and kubeadm that make upgrades easier, but they all have shortcomings.
  • E.g., if you have built your K8s cluster using Kubespray on RHEL VMs, it provides playbooks for building the cluster and for adding and removing nodes.
  • No minor version can be skipped during upgrades, i.e., one has to go through every interim version to reach the target version.
  • That is why migrating a K8s deployment from on premises to a cloud native managed offering like EKS or GKE is better: the cloud hyperscaler handles the heavy lifting of lifecycle management, and you can focus on the applications and their deployment rather than on the infrastructure and platform around the K8s cluster.

Java Container Compatibility

  • K8s compatibility with Java containers (microservices) has improved over the years.
  • Earlier versions of Java often struggled to understand container environments like Docker and crashed with heap-memory issues and unusual garbage-collection behavior.
  • This was due to the JVM's inability to see Linux cgroups and namespaces, the core container technologies.
  • Subsequent Java releases added flags (such as -XX:+UseContainerSupport) to tackle these problems.
  • Java still has a reputation for hogging memory and starting up more slowly than peers like Python and Go.
  • Hence, if we need to choose Java, it should always be the latest version, and the K8s memory limit should be set about 1 GB above the JVM max heap; i.e., if the JVM heap uses 8 GB, the K8s resource limit for the application should be set to 9 GB, as sketched below.
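
A minimal sketch of that headroom rule in a Pod spec; the name, image, and exact values are illustrative assumptions:

apiVersion: v1
kind: Pod
metadata:
  name: java-app                 # hypothetical name
spec:
  containers:
  - name: java-app
    image: java-app:1.0          # hypothetical image
    env:
    - name: JAVA_TOOL_OPTIONS    # picked up by the JVM at startup
      value: "-Xmx8g -XX:+UseContainerSupport"
    resources:
      limits:
        memory: "9Gi"            # 8 GB heap + ~1 GB headroom

The extra gigabyte absorbs non-heap memory (metaspace, thread stacks, direct buffers) so the container is not OOM-killed while the heap is still within its configured maximum.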

Liveness and Readiness Probes

  • They are excellent features to combat system problems autonomously.
  • They can restart the containers (microservices) on failure and divert traffic away from unhealthy instances.
  • But in certain cases of application startup and recovery, especially for stateful applications like messaging platforms (Kafka) and databases (MongoDB), they can be problematic.
  • E.g., for Apache Kafka, we ran a 3-broker, 3-ZooKeeper StatefulSet with a replication factor of 3 and min.insync.replicas of 2.
  • The issue occurs when Kafka starts up after an accidental failure or system crash.
  • In that situation Kafka runs additional scripts to fix corrupt indexes, which can take 15-25 minutes at startup.
  • The liveness probe would constantly fail and keep sending a kill signal to Kafka because initialDelaySeconds was set too low, preventing Kafka from ever finishing the index repair during startup.
  • It is difficult to put a number on this parameter: some recoveries take much longer, but if we increase it, we also slow down how quickly the K8s cluster restarts a container (microservice) after a genuine startup failure.
  • So we need to pick a value that balances the resilience we seek from K8s against the time the app needs to start successfully under all faulty conditions (disk failure, system crashes, network failure, app issues, etc.).
  • K8s introduced a new probe type called the "startup probe" to tackle this problem, from version 1.16 onwards.
  • It disables readiness and liveliness checks until the container has started up.

However, if you are on the cloud and using K8s as a managed service (PaaS), it can relieve you of most of the overhead that comes with platform maintenance.

Remember, technology for the sake of technology is meaningless.

Liveness Probe

  • The Kubelet uses the liveness probe to know when to restart a container. The liveness probe can detect a deadlock or hang state where the app is running but unable to make progress.
  • Restarting a container (microservice/app) in such a state helps keep the application available; a minimal sketch follows below.
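
A minimal liveness probe sketch using an HTTP check, placed inside the container's definition; the path, port, and timings are illustrative assumptions:

livenessProbe:
  httpGet:
    path: /healthz               # hypothetical health endpoint
    port: 8080                   # hypothetical container port
  initialDelaySeconds: 15        # wait before the first check
  periodSeconds: 10              # check every 10 seconds

If the endpoint stops returning a success status, the Kubelet restarts the container after the configured number of consecutive failures.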

Readiness Probe

  • The Kubelet uses the readiness probe to know when a container is ready to accept traffic. A Pod is considered ready when all of its containers are ready.
  • It is also used to control which Pods are used as backends for Services.
  • When a Pod is not ready, it is removed from Service load balancers.
  • Sometimes, apps are temporarily unable to serve traffic.
  • E.g., an app may need to load a large amount of data or a configuration file, build indexes, or wait for an external service at startup. In such cases you don't want to kill the app, but you don't want to send it traffic either; with a readiness probe, K8s Services simply do not send traffic to Pods whose apps are not ready. A sketch follows below.
  • The readiness probe keeps running on the container throughout its whole lifecycle.
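
A minimal readiness probe sketch; the path, port, and timings are illustrative assumptions:

readinessProbe:
  httpGet:
    path: /ready                 # hypothetical readiness endpoint
    port: 8080                   # hypothetical container port
  initialDelaySeconds: 5         # give the app a moment to boot
  periodSeconds: 5               # re-check every 5 seconds

Unlike a failing liveness probe, a failing readiness probe does not restart the container; it only takes the Pod out of the Service's endpoint list until the check passes again.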

Startup Probe

  • The Kubelet uses the startup probe to know when a container app has started. It disables both liveness and readiness probes until the application has started, making sure those probes don't interfere with application startup.
  • This lets you run liveness checks on slow-starting containers without the Kubelet killing them before they are up and running.
  • Many applications that run for long periods eventually transition to broken states and cannot recover except by being restarted; K8s provides the liveness probe to detect and remedy such situations.

Some Important Parameters During Probes (very useful and highly recommended)

Probes are defined inside the container's definition in the .yaml resource file used to create the Pod:

apiVersion: v1
kind: Pod
metadata:
  (...)
spec:
  containers:
  - name: my-awesome-container
    image: some-awesome-image
    (...)
    <probe-type>:
      <probe-action>: (...)
      initialDelaySeconds: 5
      periodSeconds: 10
      timeoutSeconds: 3
      successThreshold: 2
      failureThreshold: 4

  • The first one is the initial delay (initialDelaySeconds), which defines the amount of time in seconds to wait before executing the probe for the first time.
  • This interval is particularly good when we have applications with lots of dependencies and/or long loading time.
  • If this property is not set, the probe will be executed as soon as the container is loaded.
  • For an exec probe, if the command succeeds it returns ZERO, and the Kubelet considers the container live and healthy.
  • If the command returns a non-ZERO exit code, the Kubelet kills the container app and restarts it; see the exec-probe sketch after this list.
  • After this initial delay, the probe is executed, and the Kubelet waits a certain amount of time for a result before assuming a timeout failure (timeoutSeconds).
  • Keep in mind that if you have a short timeout, you may get false results just because your application did not have enough time to process the action.
  • The default value is 1 second, which is enough for most situations, but I highly recommend defining a realistic value for each application.
  • If the probe fails, the Kubelet tries again as many times as set in the failureThreshold property; this way, a temporary unavailability won't cause the probe to put the container in a failed state.
  • Only after the last consecutive failed attempt will the container be considered failed. If not set, the default number of attempts is 3.
  • In some situations, a single successful result may not be enough to really ensure the health of our container.
  • In this case, the successThreshold property sets how many consecutive successes are needed to move a container from failed back to successful. (Note that for liveness and startup probes, successThreshold must be 1; values above 1 are only valid for readiness probes.)
  • We can also set the interval between one execution and the next, in seconds, using the periodSeconds property. If not set, the probe will be executed every 10 seconds.
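
As a sketch of the ZERO/non-ZERO behavior described above, an exec-style liveness probe might look like this (the command and file path are illustrative assumptions):

livenessProbe:
  exec:
    command:                     # run inside the container
    - cat
    - /tmp/healthy               # hypothetical marker file
  initialDelaySeconds: 5
  periodSeconds: 10

As long as cat exits with 0 the container is considered healthy; once the file disappears, cat exits non-zero and the Kubelet restarts the container after failureThreshold consecutive failures.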


Protect a Slow-Starting Container With a Startup Probe

  • Sometimes you have to deal with apps that require additional startup time, which makes it tricky to set up a liveness probe.
  • The trick is to set up a startup probe with the same command, with failureThreshold × periodSeconds long enough to cover the worst-case startup time.
  • Say, with failureThreshold: 30 and periodSeconds: 10, the container app will have 300 seconds to start up.
  • If the startup probe never succeeds, the container is killed after 300 seconds and subjected to the Pod's restartPolicy. A sketch follows below.
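
A minimal startup probe sketch matching those numbers; the path and port are illustrative assumptions:

startupProbe:
  httpGet:
    path: /healthz               # hypothetical health endpoint
    port: 8080
  failureThreshold: 30           # 30 attempts ...
  periodSeconds: 10              # ... 10 s apart = 300 s budget

Liveness and readiness probes stay disabled until this probe succeeds once, so a slow cold start (like the Kafka index repair described earlier) no longer triggers restart loops.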

General Overview

  • For people who like diagrams (like myself), here is one showing the execution flow of probes in a Kubernetes cluster.


Conclusion

  • Probes are an incredibly easy-to-set-up and useful native tool in Kubernetes clusters. However, as you, brave reader who reached this point, may have seen, some situations and caveats need to be taken into consideration when developing and setting them up.
