Kubernetes Architecture: A Deep Dive
Introduction
Kubernetes has become the backbone of modern cloud-native applications, thanks to its flexible, scalable, and resilient architecture. Two key design principles power Kubernetes: microservices-based architecture and event-driven design. Together, these principles enable Kubernetes to manage complex, distributed workloads across diverse environments, delivering on its promises of scalability, resilience, and efficiency.
Microservices-Based Architecture
Kubernetes is composed of loosely coupled components, each responsible for specific tasks within the cluster. This modular design allows critical components—such as the API server, etcd, scheduler, controller manager, and kubelet—to operate independently.
Key Advantages:
- Fault isolation: a failure in one component does not bring down the whole control plane.
- Independent scaling and upgrades: each component can be scaled or replaced on its own.
- Extensibility: new controllers and add-ons can be integrated without modifying the core.
Key Components:
The diagram from the Kubernetes official website provides a high-level overview of the essential components that make up a Kubernetes cluster.
Control Plane Components:
Manage the overall state of the cluster:
- kube-apiserver: exposes the Kubernetes HTTP API and serves as the front end of the control plane.
- etcd: consistent, highly available key-value store for all cluster data.
- kube-scheduler: assigns newly created Pods to nodes.
- kube-controller-manager: runs the controllers that drive the cluster toward its desired state.
- cloud-controller-manager: integrates the cluster with the underlying cloud provider.
Node Components
Run on every node, maintaining running pods and providing the Kubernetes runtime environment:
- kubelet: ensures the containers described in PodSpecs are running and healthy.
- kube-proxy: maintains the network rules that implement Service networking on each node.
- Container runtime: the software (e.g., containerd, CRI-O) that actually runs the containers.
Add-Ons
Kubernetes' modular architecture allows for seamless integration with various add-ons, extending its functionality and enhancing its capabilities.
Examples
- Cluster DNS (CoreDNS) for in-cluster service discovery.
- The Kubernetes Dashboard web UI.
- Network plugins (CNI) such as Calico or Cilium.
- metrics-server, which supplies the resource metrics used by autoscaling.
Event-Driven Design
Kubernetes is a powerful container orchestration platform for deploying microservices architectures, and its own internals follow the same philosophy: an event-driven reconciliation loop ensures efficient communication between components. Rather than interacting directly, Kubernetes components produce and consume events through the API server, which acts as a central broker.
Unlike the traditional orchestrator pattern—where a central controller manages the entire workflow by assigning tasks to each component—Kubernetes follows a choreography pattern. This approach is like a dance where each dancer (component) knows its steps and listens for cues (events) from the music (the desired state). Each component reacts independently, without needing instructions from a central controller.
In Kubernetes, the API server operates as a message broker: it receives requests through its HTTP CRUD API, updates the cluster’s data store, and enables each controller (such as the Scheduler, Kubelet, Deployment Controller, Job Controller, etc.) to run independently in its own control loop. These controllers use the watch API to monitor relevant events, perform specific actions (e.g., assigning a Pod to a node or starting a container), and then update the cluster’s state through the API server. This event-driven, loosely coupled design allows Kubernetes to manage workflows efficiently while ensuring resilience and scalability.
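This choreography can be sketched as a toy model in Python. The `ApiServer`, `scheduler`, and `kubelet` below are illustrative stand-ins, not real Kubernetes APIs: each component only watches the central store and reacts to changes, never calling another component directly.

```python
# Toy model of Kubernetes-style choreography: components register watches on a
# central store (the "API server") and react to state changes independently.

class ApiServer:
    """Stand-in for the API server: stores objects and fans out change events."""
    def __init__(self):
        self.store = {}        # name -> object state (the "etcd" of this sketch)
        self.watchers = []     # callbacks registered via the "watch API"

    def watch(self, callback):
        self.watchers.append(callback)

    def apply(self, name, obj):
        self.store[name] = obj
        for cb in list(self.watchers):   # broadcast the change to every watcher
            cb(name, obj)

def scheduler(api):
    """Reacts to unscheduled Pods by binding them to a node."""
    def on_event(name, pod):
        if pod.get("kind") == "Pod" and "node" not in pod:
            api.apply(name, {**pod, "node": "node-1"})
    api.watch(on_event)

def kubelet(api):
    """Reacts to Pods bound to its node by 'starting' the container."""
    def on_event(name, pod):
        if (pod.get("kind") == "Pod" and pod.get("node") == "node-1"
                and pod.get("phase") != "Running"):
            api.apply(name, {**pod, "phase": "Running"})
    api.watch(on_event)

api = ApiServer()
scheduler(api)
kubelet(api)
api.apply("web-0", {"kind": "Pod"})   # a client submits a Pod
print(api.store["web-0"])
```

Note that the client never told the scheduler or kubelet what to do: each reacted to the event stream on its own, and the Pod still converged to a running state bound to a node.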
This asynchronous, event-driven flow enables components to operate independently, react to changes, and scale efficiently, ensuring that Kubernetes remains responsive and resilient even under heavy load. Features like horizontal scaling and self-healing further enhance its robustness.
Key Features:
- Asynchronous communication: components react to events rather than blocking on each other.
- Horizontal scaling: stateless components such as the API server can run as multiple replicas.
- Self-healing: controllers continuously reconcile the actual state back toward the desired state.
The List-Watch Mechanism
Kubernetes optimizes event handling through the List-Watch pattern:
- List: a client first fetches a full snapshot of the relevant objects, along with a resourceVersion marking that point in time.
- Watch: the client then opens a long-lived watch from that resourceVersion and receives only incremental change events (added, modified, deleted).
- Resync: if the watch connection breaks or the version expires, the client re-lists and resumes watching, keeping its local cache consistent.
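A minimal sketch of the idea, assuming a toy in-memory event log rather than the real client-go informer machinery: the client LISTs once for a snapshot plus a version number, then WATCHes only the events newer than that version.

```python
# Toy List-Watch: a versioned event log that supports a full snapshot (List)
# and incremental delivery of events after a given version (Watch).

class EventLog:
    def __init__(self):
        self.events = []   # ordered (resource_version, name, obj) tuples
        self.version = 0

    def record(self, name, obj):
        self.version += 1
        self.events.append((self.version, name, obj))

    def list(self):
        """Full snapshot plus the version to resume watching from."""
        state = {}
        for _, name, obj in self.events:
            state[name] = obj   # later events overwrite earlier ones
        return state, self.version

    def watch(self, since_version):
        """Only the events newer than the client's last-seen version."""
        return [e for e in self.events if e[0] > since_version]

log = EventLog()
log.record("pod-a", {"phase": "Pending"})
log.record("pod-a", {"phase": "Running"})

snapshot, rv = log.list()      # initial sync: one object at its latest state
log.record("pod-b", {"phase": "Pending"})
delta = log.watch(rv)          # incremental sync: only the new event
```

The payoff is that after the initial List, clients never re-transfer the whole cluster state; they only process the delta, which is what keeps watches cheap at scale.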
For a deep dive into Kubernetes' event system, refer to Michael Gasch's excellent article "Events, the DNA of Kubernetes".
Application Deployment
When deploying an application using kubectl apply or helm install, multiple Kubernetes components coordinate to handle the request.
Stateful vs. Stateless Applications
Before diving into the deployment process, it's essential to understand the difference between stateful and stateless applications. Stateless applications keep no persistent local data, so any replica can serve any request and Pods are interchangeable. Stateful applications (e.g., databases and message brokers) require stable identities and persistent storage that survive Pod rescheduling.
Note: For large databases, Kubernetes native primitives (e.g., StatefulSets) may not provide sufficient management capabilities. In such cases, consider using the Operator pattern for advanced management, such as automated backup and restore, and high availability. With the Operator pattern, you can encode domain knowledge of specific applications into a Kubernetes API extension. Using this, you can create, access, and manage applications with kubectl, just as you do for built-in resources like Pods.
For example, Kafka can be deployed with an Operator instead of as a plain StatefulSet; popular operators include the Confluent Kafka Operator and the Strimzi Kafka Operator.
To learn more about the Operator pattern, see my recent article: "Kubernetes Operator Explained".
StatefulSet Deployment Process
A StatefulSet runs a group of Pods and maintains a sticky identity for each of those Pods.
StatefulSet Example Configuration
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: confluentinc/cp-kafka:latest
          volumeMounts:
            - name: kafka-logs
              mountPath: /var/lib/kafka
  volumeClaimTemplates:
    - metadata:
        name: kafka-logs
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
Deployment Steps
Here’s a breakdown of the process when you run kubectl apply on the StatefulSet manifest above:
1. Request Submission: kubectl apply or helm install sends a request to the Kubernetes API server.
2. Process Request: The API server processes and validates the request.
3. Desired State Storage: The API server updates the Kubernetes data store (etcd) with the new desired state.
4. StatefulSet Controller: The StatefulSet controller creates the necessary Pod objects (in the Pending state), one at a time in ordinal order; the Headless Service named by serviceName provides their stable network identities.
5. Pod Scheduling: The Kubernetes scheduler watches for Pods that have no assigned node.
6. Node Selection: The scheduler filters and scores candidate nodes based on resource availability and constraints, then binds each Pod to the chosen node.
7. Creating PVCs (optional): The StatefulSet controller creates a PersistentVolumeClaim per Pod from volumeClaimTemplates; the storage provisioner provisions the backing volume, which the kubelet later mounts.
8. Image Pulling: The Kubelet instructs the container runtime to pull the necessary container image.
9. Container Startup: The container runtime starts the container as instructed by the Kubelet.
10. Updating API Server: The kubelet reports Pod status to the API server, which persists it in the data store.
11. Pod Identity and Networking: Each Pod receives a stable, unique identity (e.g., kafka-0, kafka-1) based on its ordinal index, resolvable through the Headless Service.
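The stable naming a StatefulSet guarantees can be illustrated with a small helper. The function name and the default namespace are assumptions of this sketch, but the DNS format follows the documented pod.service.namespace.svc.cluster.local pattern for headless Services:

```python
# Sketch of StatefulSet stable identities: each replica gets a predictable
# name (<statefulset>-<ordinal>) and, via the headless Service, a stable DNS
# record (<pod>.<service>.<namespace>.svc.cluster.local).

def stateful_set_identities(name, service, replicas, namespace="default"):
    identities = []
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"
        dns = f"{pod}.{service}.{namespace}.svc.cluster.local"
        identities.append((pod, dns))
    return identities

for pod, dns in stateful_set_identities("kafka", "kafka", 3):
    print(pod, dns)   # kafka-0 through kafka-2 with their stable DNS names
```

This predictability is exactly what brokers like Kafka need: kafka-0 keeps its name and its volume across restarts, so peers can always find it.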
General Deployment Request (Stateless)
Here’s a step-by-step breakdown of a general deployment request for a stateless application:
1. kubectl apply or helm install sends the Deployment manifest to the API server.
2. The API server validates the request and stores the desired state in etcd.
3. The Deployment controller creates a ReplicaSet matching the Pod template.
4. The ReplicaSet controller creates the required number of Pod objects.
5. The scheduler assigns each Pod to a suitable node.
6. The kubelet on each node instructs the container runtime to pull images and start containers.
7. The kubelet reports Pod status back to the API server.
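As a rough illustration, the controller chain for a stateless Deployment can be modeled like this. All function names are hypothetical; real ReplicaSets also append random suffixes to Pod names, simplified here to ordinals for determinism:

```python
# Toy model of the Deployment -> ReplicaSet -> Pods controller chain.

def deployment_controller(deployment):
    """Creates a ReplicaSet object from the Deployment's template."""
    return {"kind": "ReplicaSet",
            "replicas": deployment["replicas"],
            "template": deployment["template"]}

def replicaset_controller(rs, name):
    """Creates one Pod object per desired replica."""
    return [{"kind": "Pod", "name": f"{name}-{i}", **rs["template"]}
            for i in range(rs["replicas"])]

deployment = {"kind": "Deployment", "replicas": 2,
              "template": {"image": "nginx:1.25"}}
rs = deployment_controller(deployment)
pods = replicaset_controller(rs, "web-abc123")
print([p["name"] for p in pods])
```

Because Pods are interchangeable here, no controller tracks ordinals or per-Pod storage, which is what makes stateless rollouts and scale-downs so much simpler than the StatefulSet flow above.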
Understanding the API Server’s Role in Kubernetes’ Event-Driven Architecture
The API server is central to Kubernetes’ event-driven design, acting as a stateless service that processes client requests and distributes events to components like the scheduler and kubelet. It bridges the gap between the cluster's desired and actual states.
Key Features:
- A RESTful CRUD API for all Kubernetes objects, with authentication, authorization, and admission control applied to every request.
- The only component that reads from and writes to etcd, keeping the rest of the system decoupled from storage.
- The watch mechanism that streams change events to controllers and clients.
- Stateless by design, so it can be replicated and load-balanced for high availability.
Example: Horizontal Pod Autoscaler (HPA)
The Horizontal Pod Autoscaler (HPA) is a prime example of Kubernetes’ event-driven architecture. It consumes metrics from the cluster (e.g., CPU usage) and triggers scaling events to adjust the number of Pods based on demand.
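The core of the HPA's decision is the scaling formula documented by Kubernetes: desiredReplicas = ceil(currentReplicas * currentMetric / desiredMetric). A minimal sketch of that rule follows, with the min/max clamp added as an assumption of a typical HPA configuration:

```python
import math

# Kubernetes HPA scaling rule:
#   desiredReplicas = ceil(currentReplicas * currentMetric / desiredMetric)
def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    desired = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured replica bounds (assumed defaults for this sketch).
    return max(min_replicas, min(max_replicas, desired))

# CPU at 90% against a 50% target: scale 3 pods up to 6.
print(desired_replicas(3, 90, 50))
# CPU at 20% against a 50% target: scale 3 pods down to 2.
print(desired_replicas(3, 20, 50))
```

Because the formula is a pure function of observed metrics and desired state, the HPA fits naturally into the reconciliation model: it just watches metrics events and writes a new replica count through the API server.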
Key Benefits:
- Automatic scaling in response to real demand, without manual intervention.
- Efficient resource usage: replicas are added under load and removed when demand drops.
For a deeper understanding of the API server's implementation, refer to Stefan Schimanski and Michael Hausenblas's comprehensive series.
Managed Kubernetes Services
Kubernetes' loosely coupled architecture enables cloud providers like AWS, Google Cloud, and Azure to offer managed Kubernetes services (e.g., Amazon EKS, Google Kubernetes Engine, Azure Kubernetes Service). In these services, the cloud provider abstracts away and manages the control plane components (API server, etcd, scheduler), allowing clients to focus on their workloads and the data plane.
Conclusion
Kubernetes' architecture—combining microservices and an event-driven design—provides a flexible, scalable, and resilient platform for managing cloud-native workloads. By decoupling components and embracing asynchronous communication, Kubernetes ensures modern applications can scale and recover from failures efficiently. Additionally, managed services from cloud providers like AWS, Google Cloud, and Azure allow businesses to adopt Kubernetes without needing deep expertise in control plane management, focusing instead on deploying and scaling applications.