Kubernetes Custom Controllers part-1
#kubernetes #k8s #controller #custom-controller #container #golang #channel #informers #workqueue


What is a Controller? And what is a custom controller?

A controller is an application or feature in k8s that compares the desired number of resources with the number currently present. Based on that difference, it performs operations such as creating one or more resources, removing one or more resources, or updating some resources.

Or we can say that controllers are control loops that watch the state of your cluster, then make or request changes where needed. Each controller tries to move the current cluster state closer to the desired state.

A controller runs in a loop that continuously monitors the resources it is intended for and performs operations on them. In k8s we have the kube-controller-manager, which includes multiple built-in controllers that monitor and operate on different, specific types of resources.
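The "compare desired with current, then act" idea above can be sketched as a toy reconcile step in plain Go. This is purely illustrative: `reconcile` and the string actions are made-up names, not real client-go APIs.

```go
package main

import "fmt"

// reconcile is a toy control-loop step: it compares the desired
// replica count with the current one and returns the actions needed
// to converge. A real controller would perform these actions against
// the cluster via the API server instead of returning strings.
func reconcile(desired, current int) []string {
	var actions []string
	for current < desired {
		actions = append(actions, "create replica")
		current++
	}
	for current > desired {
		actions = append(actions, "delete replica")
		current--
	}
	return actions
}

func main() {
	fmt.Println(reconcile(3, 1)) // two replicas missing
	fmt.Println(reconcile(2, 4)) // two replicas too many
	fmt.Println(reconcile(2, 2)) // already at desired state: nothing to do
}
```

A real control loop runs this comparison repeatedly, each time an event arrives, so the cluster keeps converging toward the desired state.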

Some examples of the controllers present in the built-in kube-controller-manager are:

  1. Replication controller
  2. Endpoints controller
  3. Namespace controller
  4. Service Account controller


A custom controller is an extension of the controller concept in k8s where we define our own control-loop logic to watch the state of a particular resource in the cluster, then make or request changes wherever needed.

In simple terms, when a controller is not shipped as part of default k8s, we call it a custom controller.


Why do we need a custom controller?

Usually, custom controllers are most effective when we need to watch custom resources, but they can work with any kind of resource depending on our requirements.

Using custom controllers, we can also encode domain knowledge for our specific applications into an extension of the Kubernetes API.


Core Internal Components of a Controller

  1. Informers
  2. Work Queue

There are two main components of a k8s controller: 1) the Informer and 2) the Workqueue. But before we talk about the internals of the kube-controller-manager, we first need to understand the controller workflow in simple terms.

[Image: Controller request flow]

The key working feature of the k8s controller manager is that it is driven by the distribution of events for the changes a controller has registered for. A controller registers to watch events for changes on a specific resource, and performs operations based on the logic we have designed for it.


  • The registration of events for a specific resource type is done with the help of a very important component of the controller called the Informer.
  • The actions that we need to perform based on those events are popped off the Workqueue and handed to worker(s) for processing.

[Image: Custom controller internal structure]

Let's first take a look at a logical example of what controller logic looks like from a code perspective.

#ForExample: We want to list all of the Pods in a particular Namespace and print a message with the name of each Pod added to that namespace. So what we do is:

  1. List the resource #pod continuously in a particular namespace
  2. Then watch / monitor that list for any new pod, and if a pod is new, print a message for it

The above example explains the controller workflow in layman's terms using basic kubernetes admin commands.

In the above example, the first step of listing the resources is a very expensive operation from the API server's standpoint, because it puts a heavy load on the API server, slowing down other applications' work.

Hence, to resolve that issue we make use of internal caching of the data. And to implement that cache logic we have informers.


#Informers

Informers are a very important part of the k8s controller model. They maintain an internal cache that allows the controller to be notified only when the state of a resource changes, so that the controller can execute its instructions while reducing the load on the kube-apiserver.

Usually we use informers in such a way that each resource type has its own separate cache, but if we need to monitor or watch multiple resources in the cluster, having multiple caches would lead to a waste of resources.

Hence, to reduce this over-consumption of resources, it is recommended to use a SharedInformer instead of a simple informer when we need to watch multiple resources.



#WorkQueue

The Workqueue is another important part of the k8s controller model, especially when we are using a SharedInformer. The SharedInformer has a drawback: it doesn't provide a facility to keep track of where each controller is (because of its shared cache), so each controller must provide its own queuing and retrying mechanism (if required) to keep track of its progress.

The reason the SharedInformer lacks such a feature is precisely its shared cache: keeping a single cache for all resources avoids duplication that would otherwise lead to more, unnecessary resource consumption.

There are different types of queues supported as the WorkQueue:

  1. The rate-limiting queue
  2. The delayed queue
  3. The timed queue
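To make the idea concrete, here is a toy, self-contained FIFO work queue in plain Go that de-duplicates keys while they are waiting, which is one of the core behaviours of a controller workqueue. This is only a sketch: the real client-go workqueue package additionally provides the rate-limiting and delaying behaviour listed above, and is safe for concurrent use.

```go
package main

import "fmt"

// queue is a minimal FIFO work queue that de-duplicates keys while
// they are waiting to be processed (a simplified, non-concurrent
// model of what a controller workqueue does).
type queue struct {
	items   []string
	waiting map[string]bool
}

func newQueue() *queue {
	return &queue{waiting: map[string]bool{}}
}

// Add enqueues a key unless an identical key is already waiting.
func (q *queue) Add(key string) {
	if q.waiting[key] {
		return
	}
	q.waiting[key] = true
	q.items = append(q.items, key)
}

// Get pops the oldest key; ok is false when the queue is empty.
func (q *queue) Get() (key string, ok bool) {
	if len(q.items) == 0 {
		return "", false
	}
	key = q.items[0]
	q.items = q.items[1:]
	delete(q.waiting, key)
	return key, true
}

func main() {
	q := newQueue()
	q.Add("default/nginx")
	q.Add("default/nginx") // duplicate while still waiting: dropped
	q.Add("kube-system/dns")
	for k, ok := q.Get(); ok; k, ok = q.Get() {
		fmt.Println("processing", k)
	}
}
```

De-duplication matters because many rapid events on the same object should still result in a single reconcile of that object's latest state.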



When an object or resource that we are monitoring changes, the SharedInformer's Resource Event Handler puts a key into the Workqueue. The key is then picked up by a worker, which processes the change according to the controller's logic.

Keys that we put into the Workqueue have the following format: <resource_namespace>/<resource_name>. If no namespace is provided, the key is simply <resource_name>.
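This key format can be sketched with two small helper functions, loosely mirroring what client-go's `cache.MetaNamespaceKeyFunc` and `cache.SplitMetaNamespaceKey` do (the helper names here are illustrative, not the real API):

```go
package main

import "fmt"

// metaKey builds a workqueue key in the "<namespace>/<name>" format,
// falling back to just "<name>" for namespace-less objects.
func metaKey(namespace, name string) string {
	if namespace == "" {
		return name
	}
	return namespace + "/" + name
}

// splitKey reverses metaKey, recovering namespace and name
// from a workqueue key.
func splitKey(key string) (namespace, name string) {
	for i := 0; i < len(key); i++ {
		if key[i] == '/' {
			return key[:i], key[i+1:]
		}
	}
	return "", key // no namespace part in the key
}

func main() {
	fmt.Println(metaKey("default", "nginx")) // default/nginx
	fmt.Println(metaKey("", "my-node"))      // my-node
	ns, name := splitKey("default/nginx")
	fmt.Println(ns, name) // default nginx
}
```

Storing only the key (not the whole object) in the queue is deliberate: when a worker processes the key, it looks up the latest version of the object from the informer's cache.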



Golang coding concepts

Before jumping into the code of a custom controller, let's first go through some important Golang concepts we need to understand.


Here I am assuming that you are new to Golang and don't know much about its concepts.

However, if you already have an idea about goroutines & concurrency, you can skip this section and jump directly to the code section for writing custom controllers for k8s.


Concepts of golang required for writing a custom controller:

  1. Goroutines
  2. Channels
  3. Defer Keyword

#goroutine

A goroutine is a function that executes independently in a concurrent fashion. In simple terms, it's a lightweight thread that's managed by the Go runtime.

You can create a goroutine by using the go keyword before a function call. Let's say we have a function in golang that prints the string it takes as input:

func printString(s string) {
	fmt.Println(s)
}

Now we can run this function as a goroutine simply by calling it as below.


go printString("Hello")



Let me take another example to understand in more detail how goroutines and concurrency work.

Let's take the code snippet below, which runs a goroutine.



package main

import (
	"fmt"
)

func printString(s string) {
	for i := 0; i < 5; i++ {
		fmt.Println(s)
	}
}

func main() {
	go printString("Hello")
	printString("World")
}

  • The output of the above code will typically be:


World
World
World
World
World



But if we run both functions normally, simply calling them instead of using a goroutine, the output will look like this.


Hello
Hello
Hello
Hello
Hello
World
World
World
World
World



Similarly, we can execute both function calls as goroutines, like below.



func main() {
	go printString("Hello")
	go printString("World")
}

  • The output will now be empty: no string gets printed to the terminal.

The reason we get no output here is that a goroutine runs its function independently, and by the time the child goroutines get a chance to run, the main function has already executed and terminated the program.

But if you want to manage the execution order, there are multiple ways:

  1. Using the time.Sleep function to wait for the goroutines to finish execution
  2. Using sync.WaitGroup, provided by Go's sync package
  3. Using channels
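As a quick sketch of option 2, here is a minimal example using sync.WaitGroup to make main wait for its goroutines before exiting. The helper name `runAll` is illustrative, not a standard function.

```go
package main

import (
	"fmt"
	"sync"
)

// runAll starts one goroutine per word, then blocks until every
// goroutine has called Done. It returns how many goroutines ran.
func runAll(words []string) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	count := 0
	for _, w := range words {
		wg.Add(1) // register one more goroutine to wait for
		go func(s string) {
			defer wg.Done() // mark this goroutine as finished
			fmt.Println(s)
			mu.Lock()
			count++
			mu.Unlock()
		}(w)
	}
	wg.Wait() // block until every Done has been called
	return count
}

func main() {
	n := runAll([]string{"Hello", "World"})
	fmt.Println(n, "goroutines finished")
}
```

Note that "Hello" and "World" may print in either order, since the goroutines run concurrently; WaitGroup only guarantees that main waits for both, not the order between them.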



#defer

Now let's talk about the defer keyword, because it is often explained in a way that is hard for a beginner to understand.

The defer keyword simply means that the statement it precedes will be executed at the end of the scope of the function in which it is used.

To use it, we simply put the defer keyword in front of the statement we would like executed at the end of that function's scope.


func main() {
	defer fmt.Println("Hello World")

	fmt.Println("Started main")
}

  • The output of the above code will be:


Started main
Hello World
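One more property worth knowing: when several statements are deferred in the same function, they run in last-in, first-out order. A small sketch, where `deferOrder` is an illustrative helper (the named return value lets the deferred closures append to the result after the return statement executes):

```go
package main

import "fmt"

// deferOrder records the order in which its statements run.
// Deferred calls execute in last-in, first-out (stack) order
// when the surrounding function returns.
func deferOrder() (order []string) {
	defer func() { order = append(order, "first deferred") }()
	defer func() { order = append(order, "second deferred") }()
	order = append(order, "function body")
	return
}

func main() {
	fmt.Println(deferOrder())
	// prints: [function body second deferred first deferred]
}
```

This LIFO behaviour is why defer is a natural fit for cleanup: resources acquired last are released first.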



#Channels

Now let's talk a little bit about channels, a very important golang concurrency concept, used in controllers as the mechanism by which the controller waits for event updates.

In technical terms, a channel is a data structure, or even a design pattern, that acts as a medium through which multiple goroutines can communicate with each other. In simple terms, a channel is a pipe that allows a goroutine to write or read data.

To declare and use a channel for communication, we use the make keyword with the following syntax.

  • Before making the channel, first decide what type of data you would like to send through it.


c := make(chan <data-type>)        

Next we will see how to read & write data through channels, but before that, let's see what exactly a channel is. To do that, make a channel of int and print its type and value.


ch := make(chan int)

fmt.Printf("Type of channel: %T\n", ch)
fmt.Printf("Value of channel: %v\n", ch)

  • The output of the above code will be as below:


Type of channel: chan int
Value of channel: 0xc000022120

In the output above, the value of the channel is a memory address. So under the hood, a channel is nothing but a pointer.

Now let's see how to read and write data through the channel. To remember the syntax, think of the channel as a box: the arrow tells you whether you are putting data into the box or taking data out of it.

  • To read from the channel (take something out of the box), use the syntax below.


<-ch        

  • To write into the channel (put something into the box), use the syntax below.


ch <- 10        

By default, channels in golang are bidirectional, meaning the same channel can be used both to write and to read data. In some situations this causes confusion and is not good practice. Hence, we instead explicitly state what the channel is for, i.e. reading or writing.

To make a read-only or write-only channel, use the syntax below.


r_ch := make(<-chan <data-type>) // read-only channel
w_ch := make(chan<- <data-type>) // write-only channel
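In practice, directional channels are usually created as the narrowing of a bidirectional channel when it is passed into a function, so the compiler enforces the direction at each end. A sketch, where `produce` and `consume` are illustrative names:

```go
package main

import "fmt"

// produce can only send on ch: a receive here would not compile.
func produce(ch chan<- int) {
	for i := 1; i <= 3; i++ {
		ch <- i * 10
	}
	close(ch) // signal that no more values are coming
}

// consume can only receive from ch: a send here would not compile.
func consume(ch <-chan int) []int {
	var got []int
	for v := range ch { // loop ends when produce closes ch
		got = append(got, v)
	}
	return got
}

func main() {
	ch := make(chan int)     // bidirectional when made...
	go produce(ch)           // ...narrowed to send-only here
	fmt.Println(consume(ch)) // ...and to receive-only here
}
```

This pattern documents intent in the function signature itself and turns a misuse (sending on a receive-only channel, or vice versa) into a compile-time error rather than a runtime bug.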

In golang we have two types of channels: 1) buffered and 2) unbuffered. The examples we have used so far are unbuffered channels, which hand over only one value at a time, whereas a buffered channel can hold multiple values.

If you would like to study both kinds of channels in more depth, there are plenty of references available. However, for the controllers in this blog we only need to understand unbuffered channels.

Now we know how to define channels and how to read & write from a channel. Next we need to understand how an unbuffered channel works.

As mentioned above, an unbuffered channel hands over one value at a time (of the type we defined at declaration), and with that in mind the generic flow is:

  1. The sender blocks until the receiver is ready, and vice versa
  2. The send happens before the receive, as you can't receive a value that hasn't been sent
  3. The receive always completes before the send completes
  4. The sender & receiver are synchronised

For point 2 above: if the receiver arrives first, it simply blocks and waits for the data to be sent. If nothing can ever be sent, the runtime throws a deadlock error.

In simple terms, an unbuffered channel is like a courier hand-delivery: the delivery person hands the parcel over in person, and only once it is delivered do both parties go their own way. Whoever arrives first must wait for the other before moving on to their next step.
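Point 4 above (sender & receiver are synchronised) is also a memory-model guarantee: a send on a channel happens before the matching receive completes, so anything written before the send is visible after the receive. A small sketch, where `handOff` is an illustrative helper:

```go
package main

import "fmt"

// handOff demonstrates the synchronisation guarantee of an
// unbuffered channel: the write to msg in the child goroutine is
// guaranteed to be visible after the receive in handOff, because
// the send happens before the matching receive completes.
func handOff() string {
	done := make(chan struct{})
	var msg string
	go func() {
		msg = "written before the send"
		done <- struct{}{} // blocks until handOff receives
	}()
	<-done // after this receive, the write to msg is visible
	return msg
}

func main() {
	fmt.Println(handOff())
}
```

This is exactly how controllers idle: a goroutine blocks on a channel receive and wakes only when an event is handed over.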

Refer to the image below for a diagrammatic view of how unbuffered channels work.

[Image: Unbuffered channel in golang]

Let's take a small code example to see the channel workflow from a code perspective.


func multiplytwo(c chan int) {
	fmt.Println("Waiting for data from channel")
	temp := <-c // receiving end of the unbuffered channel c
	fmt.Println("Received data from channel")
	fmt.Println(temp * 2)
}

func main() {
	ch := make(chan int)
	defer close(ch)

	fmt.Println("Starting a child goroutine")
	go multiplytwo(ch)

	fmt.Println("Sending data over the channel")
	ch <- 2 // sending data over the channel
	fmt.Println("Main ended")
}

  • Now let's see how the code proceeds.

  1. First, the main function is called when the program is executed, and it makes the channel
  2. Because of the defer keyword, the close(ch) statement will be executed at the end of main
  3. Then it prints the "Starting a child goroutine" message and starts another goroutine, multiplytwo
  4. After creating the goroutine, the main function moves on and prints the "Sending data over the channel" message; this happens because nothing is blocking the main function yet, and the child goroutine hasn't started running
  5. Next comes the statement that sends data over the channel. Here the channel blocks further execution of the main goroutine until the receiver has taken the value
  6. Now the child goroutine prints "Waiting for data from channel", while the main goroutine is on hold at the send
  7. Then the child goroutine receives the data from the channel. Note in the image above that there is a small gap in time before the sender gets the acknowledgement of the receive
  8. Because of this gap, the child goroutine goes ahead and prints the "Received data from channel" message and the value of temp*2
  9. Once the sender gets the acknowledgement, the main goroutine resumes and prints "Main ended"

And the output of the above code will look like this.


$ go run channel.go
Starting a child goroutine
Sending data over the channel
Waiting for data from channel
Received data from channel
4
Main ended



In the next part of this blog, I will move on to the coding section and write a simple k8s custom controller, walking through a simple use case.

Thank you


