Your critical system is down, and vendors are unresponsive. How will you keep operations running smoothly?
What strategies would you use to maintain operations during a critical system failure? Share your innovative solutions.
-
In a situation where a critical system is down and vendors are unresponsive, it’s essential to stay calm and focus on what can be controlled. First, communicate transparently with your team and stakeholders, providing updates and contingency plans. Mobilise internal resources to assess the issue, leveraging backups, alternative systems, or manual processes to maintain essential functions. Escalate with vendors via all channels, including upper management, while documenting all actions. Building resilience means having strong contingency plans and adaptable teams to minimise disruption.
-
In this type of situation, we should prioritise the most important tasks. Check for a workaround or a rollback process to keep the system running, and handle things manually or use backup systems where needed. Keep everyone informed about the situation and about the remediations being deployed, to avoid panic. Collaborating with teams across the board to improve response time helps a great deal until the vendor responds or we are back on track.
-
For any critical system, classified using the AICT matrix, we should have redundancy in place before such a failure occurs, with no direct vendor dependency. We should have the control to activate that redundancy ourselves, meaning DR or a standby system, so the business can keep operating. The local site engineer and commander should be well trained, and mock drills of such disaster scenarios must be performed and recorded, to confirm that the policies and procedures actually prevent a failure from becoming a business impact with major losses.
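To make "control to activate the redundancy ourselves" concrete, here is a minimal sketch of a failover decision that needs no vendor involvement. Everything in it is an assumption for illustration: the `FailoverController` name, the threshold of three consecutive failed health checks, and the "primary"/"standby" labels are hypothetical and not taken from any specific product.

```python
from dataclasses import dataclass


@dataclass
class FailoverController:
    """Hypothetical controller that promotes the standby after repeated failed checks."""

    failure_threshold: int = 3      # assumed value; tune per system criticality
    active: str = "primary"
    consecutive_failures: int = 0

    def record_check(self, healthy: bool) -> str:
        """Record one health-check result and return the system now serving traffic."""
        if self.active != "primary":
            # Already failed over; failing back is left as a deliberate manual step.
            return self.active
        if healthy:
            # A single good check resets the count, so brief blips do not trigger DR.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.active = "standby"
        return self.active


controller = FailoverController(failure_threshold=3)
for result in [False, True, False, False, False]:
    serving = controller.record_check(result)
print(serving)
```

Requiring consecutive failures avoids activating DR on a transient error, and keeping failback manual matches the point above that trained local staff should own the decision. A mock drill can feed simulated check results through the same logic and record the outcome.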
-
My old manager asked me the same question during a promotion interview: "All hell is breaking loose, what is the first thing you do?" The first thing is to GET AWAY from the keyboard and take 5 minutes to think. Then gather information, triage the problem, plan next steps, and execute. How many times has jumping in and typing caused more issues? The best way to beat this problem, though, is to have a strategy: invite a few vendors in and get them to map out solutions to a lot of these problems. NOTHING replaces advance planning. DR strategies are fine, but they need to be tested, including the recovery. A dual-vendor setup is usually a pretty good tool for making sure vendors are responsive.
-
If a critical system goes down and the vendor isn't responding, I'd first switch to backup systems or manual processes to keep things running. I’d monitor the situation closely, try to troubleshoot the issue, and use any available workarounds. Meanwhile, I'd keep everyone informed and ready for when the vendor becomes available again.