Perusing Observability and AI
Over the years I have often heard engineering leaders say some variation of: “I want an observability solution that simply tells me when there’s a problem, where and what the problem is, and how to fix it… or, better yet, fix it for me.” Before GenAI hit the scene, I usually replied with the merits of a full-stack approach built on OpenTelemetry, and how the very nature of the OpenTelemetry distributed tracing specification provides the missing context that has kept observability solutions from reaching that next level of quickly spotting when and where a problem is happening. I still hold that position, but I now combine it with the promise of GenAI (hype aside) and existing AI capabilities, and I think it is plausible that we may one day live in a world of closed-loop observability.
Until then, what should IT and Engineering leaders make of observability? (Especially given the general immaturity of observability across many enterprises, overlaid with ever-increasing MTTR/MTTD metrics, as evidenced in the 2023 State of Observability Report/Survey.) I spoke on this topic a few weeks ago at the C2C event in Boston, where I argued that while GenAI has provided a glimpse of what may be possible one day, we shouldn’t lose sight of the recent developments in observability (including existing forms of AI). The pieces are in place today for enterprises to begin or improve upon their observability journey in meaningful ways, such that GenAI will merely be the gravy on top.
The long and short of it is that several trends together create something of a perfect storm that sets the stage for any enterprise to implement a modern and effective observability practice: cloud adoption and maturity, the evolution of IT Ops teams into SRE/Platform teams, the fragmentation of observability practice and tooling, the maturity of OpenTelemetry, and the breadth of analytic capabilities now available in many commercial observability platforms (including out-of-the-box AI-directed troubleshooting, alerting, and correlation). I plan to expound upon this in a future post, but for now, a few other points/takeaways worth noting on this topic:
In closing, I'll share a recent snippet from a conversation I had with a good friend who runs SRE at a global enterprise:
Me: So, are you excited for what GenAI can bring to your observability practice?
Friend: Ha. We've got enough to worry about just getting our data strategy right. For now, just give me OpenTelemetry, OOTB dashboards/correlations/alerts and some good search as a fallback, and we'll be more than dangerous. But yes, I'm excited about GenAI. (Paraphrased)