Paying a premium to have a human on "standby"
When you dial 9-1-1, someone ALWAYS answers the phone. You may not ever call that number in your lifetime, but you know that if you ever needed to, you'd have support on the other end. To be able to have that security though, people have to get paid. You personally may or may not be paying for it, but somehow the folks that answer the phone, and those who run the operation, are getting compensated one way or another. They have to or else they wouldn't do it. And we, again, in one way or another, are paying to have that service available to us.
Take a look at your industry and think about similar services that you have at your disposal. In my line of work, I liaise with customers who want to have IT engineering support and they want to consume that in different ways. In the most basic form, they either want to complete a project or they want to call on a provider to dispatch an engineer to go to one of their sites and help with an IT issue (usually break fix). Either way, they are paying for something.
When we look at "projects" versus "on-demand engineering programs" though, you'll notice a key difference. One is a scheduled type of work (proactive) and the other is a non-scheduled type of work (reactive). Projects have definitive start and end dates, milestones that need to be met, and a well laid out plan of attack on how to complete it in the allotted time-frame and with the given budget. On-demand engineering is more reactive in nature, although it doesn't always have to be.
As IT leaders get more creative and efficient in managing budgets across worldwide operations, they are moving from mostly reactive work to more proactive work. Why? Well, one reason is that they have built a lot more redundancy into their IT environment and don't necessarily need someone onsite within 2 or 4 hours to fix an issue. It's ok to show up the next day. The other reason is $$$. It's more cost-effective for them to call that engineer out on a next business day (NBD) SLA rather than same-day. Why?
Humans don't like to be slaves to someone else's calendar and needs. We want a well thought out, meticulous view of what our week, month, and year look like. You can't have that if you contract with a company where your agreement says you'll be at x location within 4 hours whenever they call you. How could you plan anything you want to do with a contract like that? Everything from a week-long vacation to a routine doctor's appointment is at risk. When that phone rings, you'd better abide by your contract and get onsite. If you don't, and that happens enough times, your name will become tarnished in the marketplace and you'll lose work.
Why mention this?
Well, for companies who subscribe to an on-demand engineering model, it's important to understand what they TRULY need. We see it all the time: "when we submit tickets for a dispatch, we need you onsite in 4 hours, all over the world, regardless of task." Can the customer have this type of service? Will a field provider commit to that? Absolutely. It will come with a price though. Why? Because the labor that the field provider has access to does not really want to do work that requires them to be on standby, or in other words, work that lets someone else control their time. So what's the solution?
A deskside technician whose hourly pay on scheduled work is typically $50/hour will now charge $100/hour for reactive-type work.
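To put rough numbers on that premium, here is a minimal sketch. The two hourly rates are the ones quoted above; the dispatch count, the hours per visit, and the function name are made up purely for illustration, so plug in your own figures.

```python
# Rates quoted in the article ($/hour).
SCHEDULED_RATE = 50   # scheduled / next-business-day work
REACTIVE_RATE = 100   # reactive / 4-hour SLA work

def annual_labor_cost(dispatches, hours_per_dispatch, reactive_share):
    """Labor cost when `reactive_share` (0.0 to 1.0) of the hours
    are billed at the reactive rate and the rest at the scheduled rate."""
    total_hours = dispatches * hours_per_dispatch
    reactive_hours = total_hours * reactive_share
    scheduled_hours = total_hours - reactive_hours
    return reactive_hours * REACTIVE_RATE + scheduled_hours * SCHEDULED_RATE

# Hypothetical year: 200 dispatches at 2 hours each.
all_reactive = annual_labor_cost(200, 2, 1.0)      # every call on a 4-hour SLA
mostly_scheduled = annual_labor_cost(200, 2, 0.1)  # only 10% truly urgent
print(f"All 4-hour SLA:   ${all_reactive:,.0f}")
print(f"90% NBD, 10% 4h:  ${mostly_scheduled:,.0f}")
```

Under these assumed volumes, labor alone is roughly half the cost when only the genuinely urgent calls go out on the 4-hour SLA. Real contracts usually add travel, minimums, and after-hours uplifts, which this sketch ignores.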
Do customers really need 4 hour SLA adherence at all of their company locations? Some do. Take someone like Facebook, who can't afford for their platform to be down for even one minute. There are currently 2.3 billion FB users. If the site is inaccessible for any period of time, it's a travesty. We are a society of "want it now" individuals. We're impatient and, quite honestly, we think something is terribly wrong if we can't access that photo of grandma from 1987 chugging a beer at a holiday party exactly when we want to see it. Same goes for the likes of Google, Instagram, and other platforms that have millions and billions of people accessing their servers every day. But it's not just them. Think about manufacturers who have processes in place where each step relies on the step before it in the supply chain. If one link in the chain stops, it hinders the production of the final product, which impacts whether that product actually ends up in the hands of the consumer. Lost $$$ for the company. So of course, strict response times matter and are needed in certain IT environments.
However, even the above-mentioned companies may have a provider locked into a contract who can be onsite in 2 or 4 hours, but they most likely never call them out for that SLA.
Two reasons:
- They know that a same-day dispatch is more expensive than telling the provider the technician can arrive tomorrow instead of today.
- The company has built redundancy into their environment, so it's ok that the hard drive went down. They just fail over to one of the other hundred they have in place. Again, letting the technician come the next day to fix the failed hard drive is ok.
Where am I going with this?
When looking at a long-term IT operational strategy, it's important for leaders to first understand what they have, and then the services they'll require to successfully support those operations. With SLAs, you can go down so many different rabbit holes, but mainly you should be looking at it from the perspective of how much reactive work you'll require versus how much scheduled work you can place into certain buckets. The goal is to get as much of your business as possible to a "scheduled" model, where the IT operations that support your business are not resting on whether or not Bob Smith in Paris, France shows up onsite in 4 hours when you call his mobile phone at 2am on a Saturday. You want to have contingencies in place, yes, but you also want to move the organization to a place where, month over month, you consume fewer and fewer 4 hour dispatches. (Getting employees and internal users to know what is really an emergency and what is not is a separate topic.)
So whether you're a new IT leader or a seasoned veteran, I encourage you to take a step back and really learn about the IT operations that are already in place, the historical ticket data, and your national and/or global footprint, and to collaborate around what current state looks like compared to what future state IT operations SHOULD look like. The steps along the way will be difficult because people innately hate change. But if you can strategically move your department, and eventually the entire organization, to a place where redundancy is built in around your systems, people truly understand what's an emergency, and everyone knows the cost implications of calling out for a 4 hour SLA versus an NBD SLA, you are going to become a much more efficient company, with lower dispatch costs and fewer true emergencies.
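One way to watch that month-over-month trend is to compute the share of 4 hour dispatches from your ticket history. The sketch below is only an illustration: the ticket rows, the "4H"/"NBD" labels, and the function name are hypothetical stand-ins for whatever your ticketing system exports.

```python
from collections import Counter

# Hypothetical ticket log: (month, SLA requested). Not real data.
tickets = [
    ("2024-01", "4H"), ("2024-01", "4H"), ("2024-01", "NBD"),
    ("2024-02", "4H"), ("2024-02", "NBD"), ("2024-02", "NBD"),
    ("2024-03", "4H"), ("2024-03", "NBD"), ("2024-03", "NBD"), ("2024-03", "NBD"),
]

def four_hour_share_by_month(rows):
    """Return {month: fraction of that month's dispatches on a 4-hour SLA}."""
    total = Counter()
    urgent = Counter()
    for month, sla in rows:
        total[month] += 1
        if sla == "4H":
            urgent[month] += 1
    # Sort by month so the trend reads in calendar order.
    return {month: urgent[month] / total[month] for month in sorted(total)}

shares = four_hour_share_by_month(tickets)
for month, share in shares.items():
    print(f"{month}: {share:.0%} of dispatches were 4-hour SLA")
```

If the percentage is not trending down, that is usually the cue to look at redundancy gaps or at how "emergency" is being defined by the people opening tickets.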
Meet me here
704-659-2421
Project Manager - Migrations and Transformations
5 years ago: Davis Fisher, that was a nice write-up. Many thanks. Sometimes . . . it just pays to have at least one engineer on-site, 24x7, to handle most of the immediate stuff. And then, if required, bring in another engineer or whoever, NBD. For 3 years, two of us managed over 500 servers that handled 4x stock trading systems, and it wasn't that hard. Yes, we were dedicated and were able to handle stuff from home or remote anywhere. And we did the architecture, the OS builds, the rack designs, and the DC room splits for best redundancy, so little things did not bother us. Like you said, build it good, and you can do NBD. Thanks for this post.