Issues in Kubernetes Pods post node reboot

Rajaraman Sathyamurthy

Associate Director & Senior Architect, Data Architecture

Published: January 30, 2023

You may have automated patching and reboots scheduled for your VMs as part of a maintenance or patching window, using BigFix or a similar tool. But if Kubernetes is running on those VMs, pods or nodes can get stuck or hang after the reboot, and services may not come back up properly.

Sounds familiar?

VMs that are running Kubernetes are not supposed to be rebooted this way.

Is it so? What's the right way?

Well, before rebooting the VM, the pods / services / workloads running on that node should be moved gracefully to another node. You can achieve this by running the cordon command (which marks the node unschedulable, so no new pods are placed on it) followed by the drain command (which evicts the running pods so that their controllers recreate them on other nodes that have spare resources).

Now you can reboot the VM. Once it is back up, do not forget to uncordon it so that pods can be scheduled on it again. This way you avoid the stuck/hung pods caused by abrupt VM reboots.
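The cordon, drain, reboot, uncordon sequence can be sketched roughly as below. This is only an illustration, not a tested runbook: the node name `worker-1`, the helper names `maintenance_commands` and `run`, and the drain flags and timeout are assumptions chosen for the example, and the reboot itself is left to whatever patching tool you already use.

```python
import shlex
import subprocess
from typing import List


def maintenance_commands(node: str, phase: str) -> List[List[str]]:
    """Build the kubectl commands to run before or after a node reboot.

    `phase` is "before" (cordon + drain) or "after" (uncordon).
    """
    if phase == "before":
        return [
            # Mark the node unschedulable so no new pods land on it.
            # (kubectl drain also cordons; the explicit step mirrors the text.)
            ["kubectl", "cordon", node],
            # Evict running pods; DaemonSet pods are skipped, and pods using
            # emptyDir volumes are allowed to be evicted (their data is lost).
            # The flags and timeout here are assumptions; tune them per cluster.
            [
                "kubectl", "drain", node,
                "--ignore-daemonsets",
                "--delete-emptydir-data",
                "--timeout=300s",
            ],
        ]
    if phase == "after":
        # Allow the scheduler to place pods on the node again.
        return [["kubectl", "uncordon", node]]
    raise ValueError(f"unknown phase: {phase!r}")


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them, stopping on failure."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    node = "worker-1"  # hypothetical node name
    run(maintenance_commands(node, "before"))
    # ... the patching tool reboots the VM here ...
    run(maintenance_commands(node, "after"))
```

With `dry_run=True` (the default) the script only prints the commands, which makes it easy to review them before letting it touch a real cluster.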

You can automate this with Kured (Kubernetes Reboot Daemon), combined with simple shell-script automation. Hope this helps!
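As one possible way to connect a patching job to Kured: Kured watches each node for a sentinel file and, when it appears, drains and reboots that node, then uncordons it. The sketch below assumes Kured is running with its default sentinel path `/var/run/reboot-required`; the helper names `request_reboot` and `reboot_requested` are hypothetical, and the post-patch step would call the first one instead of rebooting the VM directly.

```python
from pathlib import Path

# Kured's default sentinel path; assumed unchanged in this sketch.
DEFAULT_SENTINEL = Path("/var/run/reboot-required")


def request_reboot(sentinel: Path = DEFAULT_SENTINEL) -> None:
    """Create the sentinel file so Kured schedules a drained reboot."""
    sentinel.parent.mkdir(parents=True, exist_ok=True)
    sentinel.touch()


def reboot_requested(sentinel: Path = DEFAULT_SENTINEL) -> bool:
    """Return True if a reboot has been requested on this node."""
    return sentinel.exists()
```

Creating the file under `/var/run` normally requires root, so this would run as part of the privileged patching job.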
