Monitoring Headaches?

When infrastructure monitoring works, the business rarely notices; the monitoring just quietly does its job…

But when faults arise and affect the business, all hell can break loose…

So many companies, I have discovered over the years, don’t have a clear view of what they monitor in their IT estate, how they monitor it, or indeed whether they monitor everything they should be monitoring in the first place…

This can create a headache… just one of those niggling little aches we all ignore from time to time and hope they will just go away...

Or they might have multiple tools, multiple teams, multiple managers - all putting their multiple opinions and requirements into the monitoring melting pot…

And this can create a real headache that won’t go away without intervention

The pain can be alleviated though of course… with the right diagnosis… and the right treatment...

And it’s not always the monitoring system itself that’s to blame...

Common symptoms I often see include:

  • No alerts were received when the fault occurred...
  • Alerts were received but weren’t actioned properly...
  • Alerts were simply ignored...
  • Maybe there were too many alerts and operators were overwhelmed - they “couldn’t see the wood for the trees”…
  • Maybe they just didn’t know what to do with the alerts...
  • The tools couldn’t effectively monitor what was required...
  • Maybe nobody asked for that fault condition to be monitored in the first place...
  • Maybe the processes & procedures failed...

Sometimes it can be just one or two of these symptoms; often it can be many, or even all of them…

And that can cause a real migraine

Many organisations evolve, of course, ending up with multiple disparate or overlapping systems. Only a few I have spoken to actually realise that the problem is not the tools themselves; it’s often the monitoring function itself that is the true cause of the problems they are facing…

Blaming one particular tool and simply “buying more” doesn’t fix the underlying issues

Some organisations add to their toolsets and then don’t (or can’t or won’t) consolidate the remaining tools, often leaving them with a bigger headache than before...

It’s like putting a band-aid on a splitting headache

Diagnosis is critical

Correct treatment is essential

Solving the underlying issues and establishing firm requirements is often what’s needed here:

  • Assess the current tools, monitoring, people, processes & procedures…
  • Identify quick-wins to relieve the immediate pressure…
  • Identify monitoring gaps and plug them where possible, as quickly as possible (see the sketch after this list)…
  • Define the monitoring requirements...
  • Determine if the current environment can meet the requirements… and plan accordingly...
  • Refine the current monitoring environment to meet the requirements and re-identify any gaps...
  • Plan a consolidation and/or migration strategy if required...
  • Ensure monitoring is defined, implemented, documented, supported and trusted by all…
  • Ensure tools, people, processes and procedures support the monitoring function at all times…
  • Re-assess continually
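
To make the gap-identification step a little more concrete, here’s a minimal sketch of the idea in Python. The asset names are entirely hypothetical; in practice the two sets would come from your asset inventory (or CMDB) and an export from your monitoring tool...

  # Hypothetical example: find assets that exist but aren't monitored,
  # and monitoring entries that no longer match anything in the estate.
  inventory = {"web-01", "web-02", "db-01", "db-02", "mq-01", "backup-01"}
  monitored = {"web-01", "web-02", "db-01", "mq-01", "old-app-01"}

  gaps = inventory - monitored    # running, but nobody is watching it
  stale = monitored - inventory   # watched, but no longer in the estate

  print("Unmonitored assets:", sorted(gaps))         # db-02, backup-01
  print("Stale monitoring entries:", sorted(stale))  # old-app-01

Crude as it is, this kind of comparison is often the fastest way to see where the estate and the monitoring have drifted apart.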


“Oh but we’re all in the cloud now!” I sometimes hear...

This is another issue I have seen - an over-reliance on cloud or cloud-based tools…

Just because it’s “in the cloud” doesn’t mean it can’t fail

Just because a cloud provider might keep your server running in the cloud doesn’t mean it’s monitored how you need it monitored…

Even where monitoring capabilities are supplied by a cloud provider, they might not be sufficient for the needs of your business...

You might have additional requirements, additional “things” to monitor…

Things that have been missed by vendors, bespoke requirements, additional monitoring you would like in place to prevent outages, prevent chaos, and avoid headaches…
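
As a rough illustration of that kind of bespoke check, here’s a minimal sketch in Python using only the standard library. The endpoint URL and latency threshold are hypothetical placeholders, and a real implementation would feed the result into your alerting pipeline rather than just printing it...

  import time
  import urllib.request

  # Hypothetical values - substitute your own endpoint and threshold.
  ENDPOINT = "https://app.example.internal/health"
  MAX_LATENCY_SECONDS = 2.0

  def check_endpoint(url):
      """Return (healthy, detail) for one bespoke HTTP health check."""
      start = time.monotonic()
      try:
          with urllib.request.urlopen(url, timeout=5) as response:
              elapsed = time.monotonic() - start
              if response.status != 200:
                  return False, f"unexpected status {response.status}"
              if elapsed > MAX_LATENCY_SECONDS:
                  return False, f"slow response: {elapsed:.2f}s"
              return True, f"ok in {elapsed:.2f}s"
      except Exception as exc:  # timeouts, DNS failures, refused connections...
          return False, f"check failed: {exc}"

  healthy, detail = check_endpoint(ENDPOINT)
  if not healthy:
      # In practice this would raise an alert in your monitoring tool;
      # how that alert is routed and actioned is part of your process.
      print(f"ALERT: {ENDPOINT} - {detail}")

The point isn’t the ten lines of code - it’s that the check reflects what your business actually needs, not just what a provider gives you by default.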


I’ve also never been a fan of “top-down” views in isolation… something green on a dashboard is useless unless its meaning is understood and complete...

I believe a two-way approach is needed - top-down and bottom-up

As I remember saying to a CTO years ago (back in 2006)…

“There’s no point in having a nice shiny dashboard with green traffic lights on it if those indicators are not a true and complete reflection of all underlying connected systems”

This is still true today, some 16 years later
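
To put that quote into code, here’s a minimal sketch (with hypothetical system names) of a traffic light that only goes green when the underlying results are both healthy and complete - a missing check is treated as a fault in its own right...

  # Hypothetical example: the light goes green only if every expected
  # underlying check has reported in AND every report is healthy.
  EXPECTED = {"database", "app-server", "message-queue", "storage"}

  # Results actually received - note there is no result for "storage";
  # perhaps that check was never implemented in the first place.
  results = {"database": "ok", "app-server": "ok", "message-queue": "ok"}

  missing = EXPECTED - results.keys()
  failing = {name for name, status in results.items() if status != "ok"}

  if missing or failing:
      print("NOT GREEN - missing:", sorted(missing), "failing:", sorted(failing))
  else:
      print("GREEN - all expected systems reported healthy")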

Monitoring must be in place, complete, tested, and trusted 100% by all

Yes - 100%

If it’s not, then it’s not a matter of IF you will get a headache, but WHEN

Rip off the band-aid and diagnose the root of the problem now, not when things fail

Avoiding pain is far easier than pain management in the long run, and will save you valuable time, effort and money.

#protocol #itmonitoring #itinfrastructuremonitoring

Thank you for reading this. If you enjoyed it, please click LIKE and click SHARE to share it with your network.


About the Author:

David Gerrish has been a successful IT Contractor since 1996. He has worked throughout the UK & Europe, and has contracted further afield in countries such as Hong Kong, Singapore and Australia. He has worked with numerous blue-chip clients including Barclays, The London Stock Exchange, Hewlett Packard, Fidelity, Bupa, Cazenove and many more.

Dave is available now for contract roles and short-term monitoring consultancy. Hire him for 1 day, 5 days, a week per month, or more...

If you would like to have a chat with Dave, please email him at [email protected] to arrange a call.
