Thank you IT Ops champions for keeping the world running before, during and after the Pandemic
Lisa Wolfe
Computer Scientist | Protect Your Spirit From Corporate Burnout Author & Speaker | Host of Finding Water - AI Podcast
In an AIOps, DevOps, or NoOps world IT Ops is here to stay
Lisa Wolfe, July 12, 2021
On the Road to Burn Out
In days gone by, an IT Ops engineer would spend their day looking into performance and availability issues using dashboards from dozens of different monitoring tools. The daily pattern was always reactive: Ops teams would be finishing up their last fix when an onslaught of thousands of events would arrive across many unrelated dashboards. While the Ops team fixed a problem with network routers that was slowing online ordering services, the IT Service Desk would simultaneously be bombarded by calls and emails from employees whose VPNs were no longer working. All these issues could take hours, or in some cases days, to resolve, and day after day it was a rinse-and-repeat model across Ops and service desk teams. The IT Ops world was reactive because there were no early warning indications that things were starting to go wrong; there was no way to catch an issue before it impacted users or the business.
There had to be a better way, a faster way to identify and remediate the issues and break this repetitive pattern that ultimately led many Ops teams to burn out.
The DevOps Solution
Patrick Debois coined the term DevOps, and it has been further popularized by Gene Kim, Jez Humble and team. If DevOps were a piece of jazz music, it would be a riff on a methodology that transformed the manufacturing industry in the 1990s, described in the bestselling book The Goal by Eliyahu M. Goldratt. The premise of Goldratt’s methodology came down to identifying the bottleneck in the overall supply chain; once the bottleneck was identified and removed, everything could move much faster, and the result would be, for example, 100 cars delivered a day instead of just 70. DevOps applies a similar methodology to the software world, injecting speed into the software development process by removing the bottlenecks and breaking down the wall between the Dev team (the coders) and the Ops team (the fixers). The principle in a DevOps world is that if you're the coder you also own fixing the code when it breaks. As the saying goes, “you build it, you run it”: no more tossing it over to the Ops team to worry about fixing it when or if it breaks in production.
Did this put the Ops engineer out of a job? Not quite.
Fast forward: the DevOps methodology took off, cloud migrations accelerated, DevOps tools spawned in the open-source community by the hundreds from both individual contributors and vendors, and microservices took hold. And to support all the new DevOps tools and keep the teams moving at speed, another new team emerged on the scene: the SREs (Site Reliability Engineers). Site Reliability Engineering originated at Google and is the cloud approach to operations, which aims to fix issues using software engineering and automation; it effectively treats operations as a software problem.
The NoOps Vision
In 2018 an article was published called “Stop DevOps Someone is Going to Get Hurt”, by Tomer Simon, PhD, Chief Scientist for Microsoft Israel.
In his article Tomer Simon says:
“Maybe the lack of trust between the software development team and the IT teams should not be solved by an interim and overly complex solution, but also perform this “digital leap” and skip DevOps altogether directly to NoOps.”
NoOps? Okay, is this where Ops is really out of a job? Tomer Simon says: “There will always be, most probably someone that is doing the ops. The question is who and where and how many. For cloud providers the ratio of IT personnel per server is hundreds times less than in regular companies. And most importantly if you develop on top of PaaS, then for you and your company there is no ops. I agree that perspective plays a role here.”
Realistically, most folks and companies today are still working on getting the DevOps methodology in place and are not ready to move to the NoOps vision. And Tomer Simon’s point about DevOps being overly complex resonates far and wide. The DevOps methodology did remove the Dev constraints: thanks to infrastructure as code (IaC), Dev was no longer waiting for physical servers or test environments to be provisioned, or asking permission to grab cloud resources to spin up code, test it and drop it into production. The constraints and the bottlenecks were removed for the Dev teams. But the complexity and time burden on the Ops team was not mitigated, and new challenges arose in the oversight of configuration changes.
No Matter How You Slice IT, Ops is Here to Stay
There are new and different issues outside of the Dev teams and the application services they are responsible for. Ops now needs to be concerned with things like less experienced dev teams racking up errors, leaving cloud resources running so that cloud bills pile up, or using IaC to create configurations that inadvertently open up cloud resources, resulting in security vulnerabilities for the business. While they are no longer on the other side of the wall, the Ops team still has a vital role to play: protecting the business, mitigating risks, and predicting and preventing issues before they impact employees and customers.
And for those businesses that are already well on the way to using only infrastructure as code, the changes occurring inside the CI/CD (Continuous Integration/Continuous Deployment) pipeline still need to be tracked. Not by the Dev teams themselves, because no one wants to slow them down, but automatically, so that when services do slow down and start to degrade, previous changes that occurred in the CI/CD pipeline can be reverted automatically in real time.
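To make the idea of automatic change tracking concrete, here is a minimal Python sketch. It assumes each pipeline deployment is recorded with a service name and a timestamp; the `Change` record, the `revert_candidate` function and the one-hour lookback window are all hypothetical names and values chosen for illustration, not part of any particular product or pipeline tool. When a service degrades, the sketch picks the most recent change to that service as the first candidate to revert.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class Change:
    """One recorded CI/CD deployment (hypothetical record shape)."""
    change_id: str
    service: str
    deployed_at: datetime


def revert_candidate(
    changes: Iterable[Change],
    service: str,
    degraded_at: datetime,
    lookback: timedelta = timedelta(hours=1),  # assumed window, tune per team
) -> Optional[Change]:
    """Return the latest change to `service` deployed within `lookback`
    before the degradation started, or None if there is no such change."""
    recent = [
        c for c in changes
        if c.service == service
        and degraded_at - lookback <= c.deployed_at <= degraded_at
    ]
    return max(recent, key=lambda c: c.deployed_at, default=None)


# Illustrative data only: three tracked changes across two services.
changes = [
    Change("chg-101", "ordering", datetime(2021, 7, 12, 9, 0)),
    Change("chg-102", "vpn", datetime(2021, 7, 12, 9, 40)),
    Change("chg-103", "ordering", datetime(2021, 7, 12, 9, 50)),
]
degraded_at = datetime(2021, 7, 12, 10, 0)
candidate = revert_candidate(changes, "ordering", degraded_at)
```

In this example the ordering service degrades at 10:00, and the candidate is `chg-103`, the last ordering change before that time. A real system would also need to confirm that the change actually caused the slowdown before reverting it; this sketch only narrows down where to look.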
Predictive AIOps to the Rescue
And to ensure the Ops team can thrive in a proactive environment and break the cycle of reactive burnout, pending issues should be caught and presented to the Ops team before anything ever breaks. Predictive AIOps, which can take care of identifying thresholds and query patterns that were previously manual tasks, can now identify blind spots that the Ops team could never have anticipated. Predictive AIOps can catch issues in real time without being told in advance what to look for. Manual processes can and should be automated, and the data that is lying around in systems of record across the globe, both operational data and information about how infrastructure changes have been made through code, can be pulled into a single data model to aid with problem determination and predictive analytics, and to keep digital products and services running 24/7.
Until the NoOps vision is real, IT Operations is not going away any time soon. What will change is the nature of what this team does day to day and the quality of their work lives.
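As a rough illustration of what "identifying thresholds" without manual setup can mean, the Python sketch below learns a baseline from recent samples and flags values that stray far from it. This is a plain rolling z-score, used here as a simple stand-in; it is an assumption for illustration and not a description of how any AIOps product works. The function name `find_anomalies`, the window size and the z-score limit are all hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence


def find_anomalies(
    samples: Sequence[float],
    window: int = 5,        # assumed number of prior samples used as baseline
    z_limit: float = 3.0,   # assumed deviation limit, in standard deviations
) -> List[int]:
    """Return the indexes of samples that deviate strongly from the
    baseline formed by the `window` samples immediately before them."""
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: any different value is a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > z_limit:
            anomalies.append(i)
    return anomalies


# Illustrative latency readings in milliseconds, with one spike.
latency_ms = [100, 102, 98, 101, 99, 100, 103, 180, 101, 99]
spikes = find_anomalies(latency_ms)
```

Here no one had to decide in advance that "180 ms is too slow": the threshold comes from the data itself, and only the spike at index 7 is flagged. Production systems typically go further, accounting for seasonality and correlating signals across sources.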
Will it be:
Reactive IT Operations buried under a burden of streaming events coming from dozens of monitoring tools
or
Proactive IT Operations and a quality work life: possible, practical and realistic today with AI-Powered Service Operations from ServiceNow.
Great job with the article! The picture is priceless!
Business Owner, Fractional Sales Leader When Today’s Success Requires More Than Yesterday
Love the "read" Lisa! And the pic of you at your desk reminds me of my 1st job in DP. Reel tapes, green screen, and punch cards. Remember those days?
great read and a terrific perspective!!! Christopher Brown