Kubernetes Custom Controllers part-1
Kritik Sachdeva
Technical Support Professional at IBM | RHCA-XII | OpenShift | Ceph | Satellite | 3scale | Gluster | Ansible | Red Hatter
What is a Controller? And what is a custom controller?
A controller is a component in Kubernetes that looks at the desired state of a resource and the state currently present in the cluster. Based on that difference, it performs operations such as creating one or more resources, removing one or more resources, or updating existing resources.
Put another way, controllers are control loops that watch the state of your cluster, then make or request changes where needed. Each controller tries to move the current cluster state closer to the desired state.
A controller runs in a loop that continuously monitors the resources it is responsible for and performs operations on them. Kubernetes ships with the kube-controller-manager, which bundles multiple built-in controllers, each monitoring and acting on a specific type of resource.
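The control-loop idea can be sketched in a few lines of plain Go. This is a toy model, not real controller code: `currentReplicas`, `createReplica`, and `deleteReplica` are stand-ins for the observed cluster state and for real API calls.

```go
package main

import "fmt"

// currentReplicas simulates the observed cluster state.
var currentReplicas = 1

// createReplica and deleteReplica stand in for real API-server calls.
func createReplica() { currentReplicas++ }
func deleteReplica() { currentReplicas-- }

// reconcile moves the current state one step closer to the desired state.
func reconcile(desired int) {
	switch {
	case currentReplicas < desired:
		createReplica()
	case currentReplicas > desired:
		deleteReplica()
	}
}

func main() {
	desired := 3
	// The control loop: observe, compare, act, repeat.
	for currentReplicas != desired {
		reconcile(desired)
		fmt.Println("current replicas:", currentReplicas)
	}
}
```

Real controllers follow the same observe-compare-act cycle, only driven by events instead of a busy loop.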
Some examples of controllers bundled inside the kube-controller-manager are the Node controller, Job controller, Deployment controller, and ReplicaSet controller.
Custom controllers are an extension of this concept: we define our own control-loop logic to watch the state of a particular resource in the cluster and make or request changes wherever needed.
In simple terms, when a controller is not shipped as part of default Kubernetes, we call it a custom controller.
Why do we need a custom controller?
Custom controllers are most useful when we need to watch custom resources, but they can work with any kind of resource, depending on our requirements.
With custom controllers we can also encode domain knowledge for our specific applications into an extension of the Kubernetes API.
Core Internal Components of a Controller
A Kubernetes controller has two main internal components: 1) the Informer and 2) the Workqueue. But before we talk about these internals, we first need to understand the controller workflow in simple terms.
The key characteristic of a Kubernetes controller is that it is driven by events: a controller registers to watch for change events on a specific resource and performs operations based on the logic we have designed for it.
Let's first take a look at a logical example of what controller logic looks like from a code perspective.
For example: we want to list all of the Pods in a particular namespace and print a message with the name of each Pod that is added to that namespace. So what we do is:
The above example explains the controller workflow in plain terms using basic Kubernetes admin commands.
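The same list-and-compare idea can be sketched in Go. Everything here is illustrative, not client-go API: the two slices stand in for the results of two successive `kubectl get pods -n demo` (or API-server List) calls, and `diffAdded` is a hypothetical helper.

```go
package main

import "fmt"

// diffAdded returns the pod names present in current but not in previous,
// i.e. the Pods that appeared between two successive List calls.
func diffAdded(previous, current []string) []string {
	seen := make(map[string]bool, len(previous))
	for _, name := range previous {
		seen[name] = true
	}
	var added []string
	for _, name := range current {
		if !seen[name] {
			added = append(added, name)
		}
	}
	return added
}

func main() {
	// Two simulated snapshots of the "demo" namespace.
	previous := []string{"web-1", "web-2"}
	current := []string{"web-1", "web-2", "web-3"}

	for _, name := range diffAdded(previous, current) {
		fmt.Printf("Pod %s was added to namespace demo\n", name)
	}
}
```

Repeating this full List on every iteration is exactly the expensive pattern the next paragraph warns about.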
In the above example, the first step of listing the resources is a very expensive operation from the API server's standpoint, because it puts a heavy load on the API server and slows down other work in the cluster.
Hence, to resolve that issue, we cache the data internally, and to implement that cache logic we have informers.
The informer is a very important part of the Kubernetes controller model. It maintains an internal cache and notifies the controller only when the state of a resource changes, so the controller can run its logic while greatly reducing the load on the kube-apiserver.
Usually we use informers in such a way that each resource type has a separate cache. But if we need to monitor or watch multiple resources in the cluster, having multiple caches leads to wasted resources.
Hence, to reduce this overconsumption of resources, it is recommended to use a SharedInformer instead of a plain informer whenever we need to watch multiple resources.
The workqueue is another important part of the Kubernetes controller model, especially when we use a SharedInformer. A SharedInformer has a drawback: because its cache is shared, it does not keep track of where each controller is, so each controller must provide its own queuing and retrying mechanism to track its progress.
The reason the SharedInformer lacks this feature is precisely the shared cache: keeping per-controller state there would reintroduce duplication and unnecessary resource consumption.
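The core behaviour a controller needs from its workqueue can be sketched in plain Go. This is a minimal, non-thread-safe toy modelled on the idea behind client-go's workqueue package, not the real implementation: it stores object keys, collapses duplicate events for the same key, and hands keys out one at a time.

```go
package main

import "fmt"

// Workqueue is a minimal sketch: it holds object keys in FIFO order and
// de-duplicates keys that are already pending.
type Workqueue struct {
	items []string
	dirty map[string]bool // keys currently waiting in the queue
}

func NewWorkqueue() *Workqueue {
	return &Workqueue{dirty: make(map[string]bool)}
}

// Add enqueues a key unless that key is already pending.
func (q *Workqueue) Add(key string) {
	if q.dirty[key] {
		return // duplicate event: collapse into the pending item
	}
	q.dirty[key] = true
	q.items = append(q.items, key)
}

// Get pops the next key; ok is false when the queue is empty.
func (q *Workqueue) Get() (key string, ok bool) {
	if len(q.items) == 0 {
		return "", false
	}
	key = q.items[0]
	q.items = q.items[1:]
	delete(q.dirty, key)
	return key, true
}

func main() {
	q := NewWorkqueue()
	q.Add("default/web-1")
	q.Add("default/web-1") // second event for the same object: collapsed
	q.Add("default/web-2")
	for key, ok := q.Get(); ok; key, ok = q.Get() {
		fmt.Println("processing", key)
	}
}
```

The real client-go workqueue adds locking, retries, and rate limiting on top of this basic shape.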
Several queue types are supported as a workqueue: the basic queue, the delaying queue, and the rate-limiting queue.
When an object we are monitoring changes, the resource event handler puts a key for it into the workqueue. The controller's workers later pick that key up and fetch the corresponding object from the SharedInformer's cache for processing.
Keys that we put into the workqueue have the format <resource_namespace>/<resource_name>. If no namespace is set, the key is simply <resource_name>.
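Splitting such a key back into its parts is a one-liner with the standard library. The helper below mirrors the behaviour of client-go's `cache.SplitMetaNamespaceKey`; `splitKey` itself is an illustrative name, not the real API.

```go
package main

import (
	"fmt"
	"strings"
)

// splitKey turns "namespace/name" into (namespace, name).
// A bare "name" belongs to a cluster-scoped object with no namespace.
func splitKey(key string) (namespace, name string) {
	parts := strings.SplitN(key, "/", 2)
	if len(parts) == 1 {
		return "", parts[0]
	}
	return parts[0], parts[1]
}

func main() {
	ns, name := splitKey("kube-system/coredns")
	fmt.Printf("namespace=%q name=%q\n", ns, name)

	ns, name = splitKey("my-clusterrole") // cluster-scoped: no namespace
	fmt.Printf("namespace=%q name=%q\n", ns, name)
}
```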
Golang coding concepts
Before jumping into the code of a custom controller, let's first cover some important Golang concepts needed to write one.
Here I am assuming that you are new to Golang and don't know much about its concepts.
However, if you already have an idea about goroutines and concurrency, you can skip this section and jump directly to the code section for writing custom controllers for Kubernetes.
Golang concepts required for writing a custom controller
A goroutine is a function that executes independently and concurrently. In simple terms, it's a lightweight thread managed by the Go runtime.
You can create a goroutine by putting the go keyword before a function call. Let's say we have a function that prints the string it takes as input:
func printString(s string) {
    fmt.Println(s)
}
Now we can run this function as a goroutine simply by calling it like this:
go printString("Hello")
Let me take another example to understand in detail how goroutines and concurrency work.
Let’s take the below code snippet for running a goroutine.
package main

import (
    "fmt"
)

func printString(s string) {
    for i := 0; i < 5; i++ {
        fmt.Println(s)
    }
}

func main() {
    go printString("Hello")
    printString("World")
}
The output of the above code will usually look like this:
World
World
World
World
World
But if we run both functions normally, simply calling them instead of using a goroutine, the output will look like this:
Hello
Hello
Hello
Hello
Hello
World
World
World
World
World
Similarly, if we execute both function calls as goroutines, like below, we get no output at all.
func main() {
    go printString("Hello")
    go printString("World")
}
The reason we don't get any output in this last example is that goroutines run independently: by the time the child goroutines are scheduled, the main function has already finished executing and the program has terminated.
But if you want to manage the execution order, there are multiple ways to do so.
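One common way is sync.WaitGroup from the standard library, which lets main block until a goroutine has finished. The `waitFor` helper below is an illustrative wrapper, not a standard function.

```go
package main

import (
	"fmt"
	"sync"
)

func printString(s string) {
	fmt.Println(s)
}

// waitFor runs f in a goroutine and blocks until f has finished.
func waitFor(f func()) {
	var wg sync.WaitGroup
	wg.Add(1) // one goroutine to wait for
	go func() {
		defer wg.Done() // signal completion when f returns
		f()
	}()
	wg.Wait() // block until Done has been called
}

func main() {
	// "Hello" is guaranteed to print before "World".
	waitFor(func() { printString("Hello") })
	printString("World")
}
```

Channels, shown later in this post, are another way to coordinate goroutines.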
Now let's talk about the defer keyword, because its usual explanation is not easy for beginners to understand.
The defer keyword simply means that the deferred statement will be executed at the end of the scope of the function in which it appears.
To use it, we simply put the defer keyword in front of the statement we would like executed at the end of that function's scope:
func main() {
    defer fmt.Println("Hello World")
    fmt.Println("Started main")
}
Started main
Hello World
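One detail worth knowing: when a function has several defers, they run in last-in-first-out (stack) order as the function returns. A small sketch, using a named return value so the order is observable:

```go
package main

import "fmt"

// deferOrder records the order in which statements actually execute.
// Deferred calls run in LIFO order when the function returns.
func deferOrder() (order []string) {
	defer func() { order = append(order, "deferred first") }()
	defer func() { order = append(order, "deferred second") }()
	order = append(order, "function body")
	return
}

func main() {
	for _, step := range deferOrder() {
		fmt.Println(step)
	}
	// Prints: function body, deferred second, deferred first
}
```

This LIFO behaviour is why defer is convenient for cleanup: resources are released in the reverse order of acquisition.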
Now let's talk a little about channels, a very important Golang concurrency concept, used in controllers as the mechanism by which the controller waits for event updates.
In technical terms, a channel is a data structure (or even a design pattern) that acts as a medium through which multiple goroutines communicate with each other. In simple terms, a channel is a pipe through which a goroutine can write or read data.
To declare and use a channel for communication, we use the make keyword with the following syntax:
c := make(chan <data-type>)
Next we will see how to read and write data through channels, but before that, let's see what a channel actually is. Declare a channel with the var keyword (which leaves it nil), then initialize it with make and print it:

var ch chan int // declared but not initialized: ch is nil
ch = make(chan int)
fmt.Printf("Type of channel: %T\n", ch)
fmt.Printf("Value of channel: %v\n", ch)

Type of channel: chan int
Value of channel: 0xc000022120

In the output, the value of the channel is a memory address. So under the hood, a channel is nothing but a pointer to a runtime structure.
Now let's see how to read from and write to a channel. To remember the syntax, think of the channel as a box: the arrow tells you whether you are putting data into the box or taking data out of it.
<-ch     // read data from the channel
ch <- 10 // write 10 into the channel
By default, channels in Go are bidirectional, meaning the same channel can be used both to write and to read data. In some situations this causes confusion and is not good practice. Hence, we can instead explicitly state the purpose of the channel, i.e. whether it is for reading or for writing.
To create directional channels, use the syntax below.
r_ch := make(<-chan <data-type>) // receive-only (read-only) channel
w_ch := make(chan<- <data-type>) // send-only (write-only) channel
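In practice, directional channel types appear most often in function signatures: a bidirectional channel converts implicitly to a directional one at the call site, and the compiler then rejects misuse inside the function. A small sketch with illustrative function names:

```go
package main

import "fmt"

// produce may only send on ch; reading from it would not compile.
func produce(ch chan<- int) {
	ch <- 42
}

// consume may only receive from ch; sending on it would not compile.
func consume(ch <-chan int) int {
	return <-ch
}

func main() {
	ch := make(chan int) // bidirectional; converted implicitly below
	go produce(ch)
	fmt.Println(consume(ch)) // prints 42
}
```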
In Go there are two types of channels: 1) buffered and 2) unbuffered. The examples we have used so far are unbuffered channels, which hand over one value at a time directly from sender to receiver, whereas a buffered channel can hold multiple values.
If you would like to study both channel types in more depth, please refer to the reference links. However, for the controllers in this blog we only need to understand unbuffered channels.
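For completeness, here is what the buffered variant looks like. The second argument to make is the buffer capacity; sends succeed without a waiting receiver until the buffer fills. The `bufferedDemo` helper is illustrative.

```go
package main

import "fmt"

// bufferedDemo shows that a buffered channel accepts sends up to its
// capacity even when no receiver is ready yet.
func bufferedDemo() (first, second int) {
	ch := make(chan int, 2) // capacity 2
	ch <- 1
	ch <- 2 // still no receiver needed; a third send here would block
	return <-ch, <-ch
}

func main() {
	a, b := bufferedDemo()
	fmt.Println(a, b) // prints 1 2 (FIFO order)
}
```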
Now we know how to define a channel and how to read and write through it. Next we need to understand how an unbuffered channel works.
As mentioned above, an unbuffered channel transfers one value at a time (of the type declared at creation), and the generic flow is: the sender blocks until a receiver is ready, the receiver blocks until a sender is ready, and the handover happens only when both sides meet.
Whichever side arrives first simply blocks and waits for the other. If the other side never arrives and no goroutine can make progress, the Go runtime reports a deadlock error.
In simple terms, an unbuffered channel is like a courier handover: the delivery person and the recipient must both be present, and once the parcel changes hands, each goes their own way. Whoever arrives first must wait for the other before moving on.
Refer to the image below for a diagrammatic view of how unbuffered channels work.
Let's take a small code example to see the channel workflow from a code perspective. Below is the example I am taking up.
package main

import "fmt"

func multiplytwo(c chan int) {
    fmt.Println("Waiting for data from channel")
    temp := <-c // receiving end of the unbuffered channel c
    fmt.Println("Received data from channel")
    fmt.Println(temp * 2)
}

func main() {
    ch := make(chan int)
    defer close(ch)

    fmt.Println("Starting a child goroutine")
    go multiplytwo(ch)

    fmt.Println("Sending data over the channel")
    ch <- 2 // sending data over the channel
    fmt.Println("Main ended")
}
The output of the above code looks like this. Notice that "Sending data over the channel" is printed before the goroutine's messages: at that point main has not yet blocked on the send, and the child goroutine has not yet been scheduled.
$ go run channel.go
Starting a child goroutine
Sending data over the channel
Waiting for data from channel
Received data from channel
4
Main ended
In the next part of this blog, I will move on to the coding section and write a simple Kubernetes custom controller, walking through a simple use case.
Thank you
Reference Links: