Will ChatGPT replace SREs?
First of all, let me set expectations. I'm not an artificial intelligence (AI) scientist but a site reliability engineer (SRE). Although I've completed a Stanford University course on the mathematical foundations of machine learning through Coursera, that doesn't make me a specialist in the cognitive domain. This post is based on what differentiates us from machines: creativity, instinct, thought processes, and mental quirks.
ChatGPT from OpenAI created an uproar on the Internet by showcasing its ability to answer questions conversationally, much as a human being would. Even more impressive, it demonstrated knowledge in several fields, including coding. People shared screenshots and recorded videos of ChatGPT producing code in multiple programming languages to solve problems described in a chat. AI's capacity to generate code is not new, though. Another breakthrough was GitHub Copilot, which became generally available in 2022. Despite the controversy over its knowledge source, it suggests code based on the comments you feed in, and it does that well.
Recently I heard a colleague joke that the SRE role is living on borrowed time with the rise of ChatGPT. Maybe a better statement would be that SREs will become obsolete with the deployment of AI models. I laughed, of course, since I disagree with that. Still, our daily chores will definitely change, and they must change naturally. This is how.
SREs have been using AI long before ChatGPT
As SREs, we already rely heavily on AI to make sense of the vast quantities of signals coming from the full-stack monitoring platform. When the observability principle is followed to the letter, systems generate metrics, events, logs, and traces. Understanding this big data from thousands of devices, services, applications, and systems goes beyond human capacity. AI learns normal system behavior and calls an SRE for further investigation when it detects an anomaly. It accomplishes that using techniques such as event correlation, blast radius analysis, and topology self-discovery. This is often referred to as AIOps (IT operations through the application of AI).
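To make "learning normal behavior and flagging an anomaly" concrete, here is a minimal sketch in Python. It is not how any particular AIOps product works. It assumes a simple trailing-window z-score, and the function name `find_anomalies`, the window size, the threshold, and the sample latencies are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return the indices of points that deviate from the trailing window.

    The "normal behavior" here is just the mean and standard deviation of
    the previous `window` samples (an illustrative assumption, far simpler
    than what a real monitoring platform would use).
    """
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline: treat any change as anomalous.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical request latencies in milliseconds, with one obvious spike.
latency_ms = [100, 102, 98, 101, 99, 100, 103, 450, 101, 99]
suspicious = find_anomalies(latency_ms)
print(suspicious)  # indices an SRE would be asked to investigate
```

In this sketch the detector only points at the sample. Deciding what the spike means is still left to the SRE.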
Another area where SREs walk hand in hand with AI is detecting toil. Toil is any repetitive, manual task that adds no value for the system's end user, and SREs are eager to eliminate it because it keeps IT operations from scaling. AI can analyze events, incidents, and responses to recommend which ones should be automated through an automation framework. SREs then write code to make those tasks fully automatic or, where that isn't possible, at least automated.
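The idea of mining incident history for toil can be sketched with plain counting, no machine learning required. The example below is only an illustration under stated assumptions: the `Incident` record, its fields, the `toil_candidates` function, and the sample history are hypothetical and do not come from any specific tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    alert: str      # what fired
    response: str   # what the on-call engineer did
    manual: bool    # True if a human performed the response by hand


def toil_candidates(incidents, min_occurrences=3):
    """Rank manual (alert, response) pairs that keep recurring.

    A pair that shows up at least `min_occurrences` times is a candidate
    for automation. The threshold is an arbitrary illustrative choice.
    """
    counts = Counter(
        (incident.alert, incident.response)
        for incident in incidents
        if incident.manual
    )
    return [(pair, n) for pair, n in counts.most_common() if n >= min_occurrences]


# Hypothetical incident history.
history = [
    Incident("disk_full", "rotate logs", manual=True),
    Incident("disk_full", "rotate logs", manual=True),
    Incident("cert_expiring", "renew certificate", manual=True),
    Incident("disk_full", "rotate logs", manual=True),
    Incident("pod_crashloop", "restart pod", manual=False),
]

for (alert, response), n in toil_candidates(history):
    print(f"{alert}: '{response}' was done by hand {n} times")
```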
Where can AI not help SREs?
SREs are often responsible for resolving unprecedented problems. First-of-a-kind issues or weaknesses are not as rare as we would like them to be. No AI model has been trained for such cases, although AI can still help with the data analysis needed to troubleshoot them. Nevertheless, SREs need to figure out what's happening and why. Once the causes are determined and the actions established, SREs can feed the findings back into the AI model so that it keeps learning.
What counts as a good service level is subjective enough to keep AI out of the decision. Even SREs have trouble agreeing on service-level objectives (and agreements) with the system's users. A two-second delay may be good enough for one person, but it's an eternity for another. Good old common sense, empathy, and negotiation are still required.
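A small sketch shows why the subjective part cannot be delegated. Measuring compliance is mechanical, but the threshold and target have to be agreed with users. The function names, the 95% target, and the sample latencies below are hypothetical and chosen only for illustration.

```python
def slo_compliance(latencies_s, threshold_s):
    """Return the fraction of requests served within `threshold_s` seconds."""
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    within = sum(1 for latency in latencies_s if latency <= threshold_s)
    return within / len(latencies_s)


def meets_slo(latencies_s, threshold_s, target):
    """Check whether the measured compliance reaches the agreed target."""
    return slo_compliance(latencies_s, threshold_s) >= target


# Hypothetical response times in seconds for the same service.
observed = [0.3, 0.5, 1.8, 0.4, 2.0, 0.6, 1.9, 0.2, 0.7, 1.5]

# One user group is happy with two seconds, another expects half a second.
print(meets_slo(observed, threshold_s=2.0, target=0.95))
print(meets_slo(observed, threshold_s=0.5, target=0.95))
```

The same measurements pass one objective and fail the other. Which objective is the right one comes out of a conversation with users, not out of the data.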
So, will ChatGPT replace SREs?
In the short term, absolutely not. In the long run, it's possible to train ChatGPT with some site reliability engineering domain knowledge, but it will never replace a full-fledged SRE. Technology is ephemeral, while principles and core skills last much longer. SREs must keep up with technological evolution while staying true to their professional ideals. ChatGPT should never be an enemy but another tool in the SRE utility belt.
1 year ago: That's precise, Rod Anami. It will take time for AI to catch up. Even if AI can catch up on skills, it will take some time to reach a confidence level at which people trust it blindly with their production environment. In a way, ChatGPT will empower SREs by enhancing our skills, speed, and domain expertise when making decisions. In the long run, we may not see things and roles as they are now.