Monitoring Tools Tailored to the Human

Mark T.

Senior Consultant at Microsoft

Published: December 20, 2018

Once upon a time at a client, I saw a well-designed technical solution fall partially short of its desired outcome because of an unanticipated people factor. Adjusting the technical approach to account for that factor helped the situation. What follows is inspired by this customer experience.

To move to a more proactive approach to incident handling (for servers), a monitoring tool was designed and implemented to alert technicians not only when things go wrong, but also when servers become degraded, allowing admins to get out in front of, and (ideally) resolve, issues before they become, well…issues.

With this sort of tool, however, spurious alerts must be managed, or else technicians come to view the alerts as “noise” and ignore them altogether. For this reason, tuning the tool is an important phase of the implementation; but there must also be a process change for performing routine work on the servers: placing the devices in maintenance mode first. If a device is rebooted without first suspending its monitoring, spurious alerts may result, and maintenance mode is the way to prevent them.
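To make the idea concrete, here is a minimal sketch of what maintenance mode does for alerting: any alert raised for a host while that host has an open maintenance window is treated as expected and filtered out. The names `MaintenanceWindow`, `Alert`, and `actionable_alerts` are hypothetical and for illustration only; they are not the API of any particular monitoring product, and a real tool will have its own data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class MaintenanceWindow:
    """Hypothetical record: monitoring for `host` is paused from start to end."""
    host: str
    start: datetime
    end: datetime

    def covers(self, host: str, when: datetime) -> bool:
        # Host names are compared case-insensitively; the end time is exclusive.
        return host.lower() == self.host.lower() and self.start <= when < self.end


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert raised by the monitoring tool."""
    host: str
    raised_at: datetime
    message: str


def actionable_alerts(alerts, windows):
    """Return only the alerts that were NOT raised inside a maintenance window."""
    return [
        alert for alert in alerts
        if not any(w.covers(alert.host, alert.raised_at) for w in windows)
    ]


if __name__ == "__main__":
    start = datetime(2018, 12, 20, 22, 0)
    windows = [MaintenanceWindow("srv01", start, start + timedelta(minutes=30))]
    alerts = [
        Alert("srv01", start + timedelta(minutes=5), "heartbeat lost"),
        Alert("srv02", start + timedelta(minutes=5), "heartbeat lost"),
        Alert("srv01", start + timedelta(minutes=45), "disk queue degraded"),
    ]
    for alert in actionable_alerts(alerts, windows):
        print(alert.host, alert.message)
```

In this example the reboot-related alert on `srv01` is suppressed, while the alert on `srv02` (no window) and the later alert on `srv01` (after the window closed) still reach the technicians.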

So…the tool was implemented and tuned, and procedures were modified to require technicians to access the monitoring console and place devices in maintenance mode prior to rebooting them. And yet, they did not do so – for some reason, and I haven’t yet divined why, some folks just feel it’s easier to close out alerts the next day than it is to place the devices in maintenance mode (I think it may be a path of least resistance thing – logging onto the console is too much to ask). Technicians were patching and rebooting servers without maintenance mode, generating a host of spurious alerts. Reports contained inaccurate data, unnecessary tickets were created, and time was wasted researching meaningless alerts.

A few of my teammates and I were chatting about this over lunch one day, and I put the idea out there about eliminating the “extra” step of accessing the monitoring console by providing a means for the server technicians to place the devices in maintenance mode right from the device itself – a script they could run right from the desktop. The result of this collaborative effort may be found here – a method for placing devices into maintenance mode remotely.
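For a rough sense of what such a desktop script could look like, here is a minimal sketch. It is not the script our team produced; it only illustrates the shape of the idea: work out the local host name, compute a maintenance window, and hand the request to the monitoring system. The function names, the request fields, and the `send` transport are all assumptions made for illustration, and the actual call into a given monitoring product would replace the placeholder `send`.

```python
import json
import socket
from datetime import datetime, timedelta, timezone


def build_maintenance_request(minutes, reason, host=None, now=None):
    """Build a (hypothetical) request to pause alerting for this machine.

    `host` defaults to the local machine name so the technician never has
    to type it or open the monitoring console.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    host = host or socket.gethostname()
    now = now or datetime.now(timezone.utc)
    return {
        "host": host,
        "start": now.isoformat(),
        "end": (now + timedelta(minutes=minutes)).isoformat(),
        "reason": reason,
    }


def submit(request, send=print):
    """Pass the request to a transport.

    `send` is a placeholder (it just prints by default). A real script would
    swap in whatever the monitoring product provides for starting
    maintenance mode.
    """
    send(json.dumps(request, sort_keys=True))


if __name__ == "__main__":
    # One double-click from the desktop: 30 minutes of maintenance mode.
    submit(build_maintenance_request(minutes=30, reason="Patching and reboot"))
```

The design point is the default for `host`: because the script runs on the server being patched, the one piece of information the console would have asked for is already known, which removes the “extra” step.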

For Operations Managers, and for Event Management process owners, the reduction of those (say it with me) spurious alerts is a critical success factor. As engineers, we may design our tools and processes in such a manner that we believe we have provided everything needed to realize our goals, but do not forget the human factor!
