A Production Issue that's Hard to Find
Mutha Nagavamsi
Kubernetes, DevOps, Cloud & Tech. I run a supercool k8s community, do join. 75K+ strong across all socials.
Regardless of where you are running your app (K8s or non-K8s environment), sometimes random people start to complain: "Your app doesn't load!"
Naturally, these issues are overwhelming. And scary too. Because they impact the business.
What's interesting is that your app loads fine on your device. On all of your devices. And your monitoring systems look quite normal. Except that your customer still has the issue.
So obviously, you tell your customer support team: "It's the customer's internet issue."
But the customer responds: "My other apps are working fine."
Now that's when the actual trouble starts.
In scenarios like these, in most cases, the cause is an issue with the ISP's local DNS server or a caching problem. (I've seen it a few times, in some of the apps I manage for my org!)
If your application's IP address has recently changed, cached DNS entries keep pointing to the old address until they expire, so the change takes time to reach every user. Most often, you are unaware of these changes.
Both of these issues are absolutely out of your control. Finding them is hard as well.
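One way to make this kind of problem a little easier to spot is to ask several DNS resolvers for your app's A record and compare the answers with the IP you expect. The sketch below is a minimal, standard-library-only illustration, not a production tool. The hostname `app.example.com`, the expected IP, and the function names are all hypothetical. The resolver addresses (8.8.8.8, 1.1.1.1, 9.9.9.9) are well-known public resolvers, used here only as stand-ins. An ISP's resolver usually answers only its own customers, so to check that one you would likely need to run this from the affected network.

```python
import socket
import struct


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def _skip_name(packet, offset):
    """Return the offset just past a (possibly compressed) DNS name."""
    while True:
        length = packet[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # compression pointer is 2 bytes long
            return offset + 2
        offset += 1 + length


def parse_a_records(packet):
    """Return a list of (ip, ttl) for each A record in the answer section."""
    _, _, qdcount, ancount, _, _ = struct.unpack("!HHHHHH", packet[:12])
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    records = []
    for _ in range(ancount):
        offset = _skip_name(packet, offset)
        rtype, _rclass, ttl, rdlength = struct.unpack(
            "!HHIH", packet[offset:offset + 10]
        )
        offset += 10
        if rtype == 1 and rdlength == 4:  # IPv4 address record
            records.append((socket.inet_ntoa(packet[offset:offset + 4]), ttl))
        offset += rdlength
    return records


def resolve_via(resolver_ip, hostname, timeout=3.0):
    """Ask one specific resolver for hostname's A records over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(hostname), (resolver_ip, 53))
        response, _ = sock.recvfrom(4096)
    return parse_a_records(response)


def find_stale(answers, expected_ips):
    """Return resolvers whose answers contain an IP outside expected_ips."""
    stale = {}
    for resolver, records in answers.items():
        unexpected = [(ip, ttl) for ip, ttl in records if ip not in expected_ips]
        if unexpected:
            stale[resolver] = unexpected
    return stale


def survey(hostname, expected_ips, resolvers=("8.8.8.8", "1.1.1.1", "9.9.9.9")):
    """Query each resolver and report the ones still serving unexpected IPs."""
    answers = {}
    for resolver in resolvers:
        try:
            answers[resolver] = resolve_via(resolver, hostname)
        except OSError:  # timeouts and network errors: skip this resolver
            continue
    return find_stale(answers, expected_ips)
```

Calling `survey("app.example.com", {"203.0.113.10"})` would return a dictionary of resolvers that still hand out some other address, together with the remaining TTL of each cached record. That TTL hints at how long the stale entry may linger. An empty result only means the surveyed resolvers agree with you. It does not rule out a stale cache at the customer's ISP or on their device.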
How do you fix the problem? 2 Ways.
And that's why, my friend, thinking outside the box matters.
Hope you learned something today. The purpose of learning is growth, not grades. Thank you.
Btw, if you are interested in my work, consider checking out my Twitter and Substack newsletter too. It helps me.
Tech Lead | DevOps | AIOps| Azure | K8s | Terraform | GenAI
6 months ago: In my case, we use Azure Application Gateway with a static IP address. We create a DNS record for the application pointing to the app gateway IP. I have never seen any issue with DNS, as the gateway IP never changed. But it's good to know about this: if we recreate the app gateway, we will get a different IP and end up with this issue. Thanks for sharing, Mutha Nagavamsi.
Cloud Specialist DevOps at Niveus Solutions Pvt. Ltd.
6 months ago: Thanks for sharing such informative stuff and the way you simplified it!
Technologist & Believer in Systems for People and People for Systems
6 months ago: Thanks for the simple walkthrough of concepts with scenarios for the good.
Open Source Developer | Contributor @glasskube @buildsafe @cyclops
6 months ago: Yup, I recently came across a podcast with an Engineering Director of JIO Cinema. Their clients faced app crashes. Later they found that it was a DNS issue, and they fixed it via routing the DNS.
Follow for Your Daily Dose of Coding, Software Development & System Design Tips | Tech Book Buff | Exploring AI | Everything I write reflects my personal thoughts and has nothing to do with my employer.
6 months ago: Interesting, loved the way you simplified it. But I think there could be other reasons as well that can cause such issues for the customer. The problem could lie with the user's device or other network configuration. Just thinking, it could also be that the CDN's edge servers are having regional outages. But yes, such scenarios are scary.