Robotic Process Automation: Think Resilience

I really wish that bots were intrinsically resilient, able to withstand difficult conditions and changes with ease.

You can obviously build technical “internal” resilience into your bots by optimising your coding (or, let’s better say, by over-engineering it).

You can have the best developers, the most intuitive development consoles or the most innovative bot development standards... But the big problem with RPA is that the operating effectiveness of bots depends heavily on the stability of the target processes, data formats and supporting systems/applications.

If your bots are not resilient to these “external” changes, your RPA initiative will probably be a failure (as you will slowly but surely start spending a lot of time, effort and money recovering from bot failures rather than scaling). Building external resilience into your bots should simply be a key design principle, consistently followed and enforced.

A bot is deemed resilient if it is able to withstand or recover quickly from difficult conditions. So how can this be achieved?

There are, of course, the usual “Agile” and “DevOps” aspects. I absolutely agree, as long as they are applied properly. Agile and DevOps mindsets (and delivery methods) help address a key imperative of RPA effectiveness: the speed at which bot re-calibration activities can be deployed, which is critical for fast recovery.

The goal and benefits (and pitfalls) of applying Agile and DevOps to RPA have been addressed in multiple articles so I will not expand on this.

The purpose of this post is to focus on 2 other key aspects.

1. Deployment of a continuous monitoring system that is an integral part of your RPA environment

The goal here is to detect changes to the external context in order to re-calibrate your bots as quickly as possible. The context in RPA refers to processes and systems.

Changes to the context should be continuously monitored in order to detect adverse events that could impact the operating effectiveness of your bots. The aim is to quickly apply the necessary corrective actions to your bots in order to avoid significant disruption. Speed of reaction is key, so you need to ensure that your bot monitoring hubs are responsive and close enough to the relevant bot development factories/teams.

Automated change monitoring mechanisms are usually more reliable than manual ones (as they require no human intervention and are less prone to human errors).

From a system monitoring perspective, I have highlighted 2 illustrative examples of automated continuous monitoring mechanisms in one of my previous posts, RPA: 6 checks that you must perform before releasing a bot in production:

  • Execution of “Replica” housekeeping bots from the RPA QA Control room against QA transactional systems/applications. These bots are copies of your production bots and are able to detect a change in the QA transactional systems that will soon impact the production ones. They send early warnings to the RPA development teams; speed of reaction is crucial in order to undertake the necessary re-calibration activities on the impacted bots.
  • Definition of events and execution of monitoring rules/scripts directly in the transactional systems to detect structural or functional UI changes (e.g. if you are an SAP-centric organisation, you can leverage the Continuous Monitoring platform in the SAP Process Control application). Let’s say, as an example, that a specific screen layout has been modified: a new field has been made mandatory and, as per the existing coding, the bot does not fill in this field. The bot will fail because its entries will no longer be saved/submitted. Automated rules that monitor the QA transactional systems can flag this change (configuration monitoring) and send an alert to the relevant robotic operation centre(s); a minimal sketch of such a rule follows this list.
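
As an illustration, here is a minimal sketch of such a configuration-monitoring rule in Python. The screen name, the field structure and the fetch_screen_metadata() helper are all hypothetical assumptions for illustration, not a real SAP or RPA-vendor API; in practice the “current” view would come from a metadata extract of the QA transactional system.

```python
"""Illustrative configuration-monitoring rule (sketch only)."""

# Baseline snapshot of the screen's field definitions, taken when the bot was released.
BASELINE = {
    "CONTRACT_ENTRY_0100": {
        "contract_id": {"mandatory": True},
        "cost_centre": {"mandatory": False},
    }
}


def fetch_screen_metadata(screen_id: str) -> dict:
    """Hypothetical extract of the current field definitions from the QA system."""
    # Stubbed here: 'cost_centre' has just been made mandatory by a change.
    return {
        "contract_id": {"mandatory": True},
        "cost_centre": {"mandatory": True},
    }


def detect_new_mandatory_fields(screen_id: str) -> list[str]:
    """Return fields that became mandatory since the baseline snapshot."""
    baseline = BASELINE[screen_id]
    current = fetch_screen_metadata(screen_id)
    return [
        field
        for field, props in current.items()
        if props.get("mandatory") and not baseline.get(field, {}).get("mandatory")
    ]


if __name__ == "__main__":
    changed = detect_new_mandatory_fields("CONTRACT_ENTRY_0100")
    if changed:
        # In a real setup this alert would go to the relevant robotic operation centre(s).
        print(f"[ALERT] CONTRACT_ENTRY_0100: new mandatory fields {changed}, bots may fail on save")
```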

From a process monitoring perspective, I must admit that it is quite challenging to automatically detect changes to a process/subprocess. You can, however, deploy detective mechanisms that will help you flag potential changes. I have highlighted 3 illustrative automated continuous monitoring mechanisms:

  • Execution of “Parametrisation” housekeeping bots from the RPA QA Control room against QA transactional systems/applications. This time, the bots capture ongoing changes to the data sources that store the parameter values used during bot execution (such as organisational elements, object types, etc.), which allows you to constantly adjust the scope of your bots. As an example, this type of housekeeping bot can continuously monitor any addition or deletion made to a central configuration table in SAP that stores all company codes per country code. If a new company code has been created under a specific country code, the bot sends an alert to a designated bot operator, who evaluates the need to manually re-adjust the parameter lists of the impacted production bots in order to extend the scope of bot processing (see the sketch after this list).
  • Execution of “Document Versioning” housekeeping bots. A change to a given process/subprocess is usually preceded by an update to the process design document or the standard operating procedure. These documents are usually stored in a formal enterprise repository (a document or content management system, a shared drive, etc.) and are under version control, so every change carries a date/time stamp. The “Document Versioning” housekeeping bot can continuously monitor changes to the versions of the key design documents that govern bot execution.
  • Execution of process mining routines that flag process deviations and variations. This provides fact-based evidence that something within your process/subprocess has actually changed. Continuous process mining, combined with “Parametrisation” housekeeping bots and monitoring rules, is the killer combination, allowing you to analyse the “transactional” effect of parameter and structural/functional UI changes.
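
Here is a minimal sketch of the “Parametrisation” check described above, assuming the parameter source is a simple country-to-company-code mapping; the table layout and the fetch_company_codes() helper are assumptions, since in SAP this would come from the relevant configuration table.

```python
"""Illustrative "Parametrisation" housekeeping check (sketch only)."""

# Snapshot of the parameter values currently used by the production bots.
KNOWN_PARAMETERS = {
    "DE": {"1000", "1010"},
    "FR": {"2000"},
}


def fetch_company_codes() -> dict[str, set[str]]:
    """Hypothetical extract of the current configuration from the QA system."""
    # Stubbed here: company code 1020 has just been created under DE.
    return {
        "DE": {"1000", "1010", "1020"},
        "FR": {"2000"},
    }


def diff_parameters() -> dict[str, dict[str, set[str]]]:
    """Return added/removed company codes per country since the snapshot."""
    current = fetch_company_codes()
    changes = {}
    for country in set(KNOWN_PARAMETERS) | set(current):
        added = current.get(country, set()) - KNOWN_PARAMETERS.get(country, set())
        removed = KNOWN_PARAMETERS.get(country, set()) - current.get(country, set())
        if added or removed:
            changes[country] = {"added": added, "removed": removed}
    return changes


if __name__ == "__main__":
    for country, delta in diff_parameters().items():
        # In a real setup this alert would go to the designated bot operator.
        print(f"[ALERT] {country}: company codes added {delta['added']}, removed {delta['removed']}")
```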

You can even go a step further here, if and when possible (though this is considerably more complex). Instead of just sending alerts, the Replica and Parametrisation housekeeping bots can actually invoke “Readjustment” housekeeping bots to automatically re-calibrate production bots (e.g. adding a new organisational value to a specific field in the parameter list of a bot).
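
Building on the previous sketch, a hypothetical “Readjustment” step could look like the following. The parameter-list file format and the apply_readjustment() function are assumptions; the real mechanism depends entirely on how your RPA platform stores bot parameters (queues, assets, credential stores, etc.).

```python
"""Illustrative "Readjustment" step (sketch only)."""

import json
from pathlib import Path

PARAMETER_FILE = Path("bot_parameters.json")   # hypothetical parameter list of a production bot


def apply_readjustment(country: str, new_company_codes: set[str]) -> None:
    """Append newly detected company codes to the bot's parameter list."""
    params = json.loads(PARAMETER_FILE.read_text()) if PARAMETER_FILE.exists() else {}
    scope = set(params.get(country, []))
    params[country] = sorted(scope | new_company_codes)
    PARAMETER_FILE.write_text(json.dumps(params, indent=2))
    print(f"[READJUSTED] {country}: parameter list extended with {sorted(new_company_codes)}")


if __name__ == "__main__":
    # Value taken from a Parametrisation alert (see the previous sketch).
    apply_readjustment("DE", {"1020"})
```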

As part of the bot design work, you should take the time to define what I call KCIs (Key Change Indicators) at process and system levels. Each KCI should be assigned to a dedicated monitoring owner/agent.
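
To make this tangible, here is a minimal sketch of what a KCI register might look like; the fields and the example entries are hypothetical and would need to be adapted to your own processes, systems and operating model.

```python
"""Illustrative KCI (Key Change Indicator) register (sketch only)."""

from dataclasses import dataclass


@dataclass
class KeyChangeIndicator:
    name: str
    level: str          # "process" or "system"
    mechanism: str      # which monitoring mechanism detects it
    owner: str          # dedicated monitoring owner/agent


KCI_REGISTER = [
    KeyChangeIndicator("New mandatory field on contract entry screen",
                       "system", "configuration monitoring rule", "ROC team A"),
    KeyChangeIndicator("New company code created for an in-scope country",
                       "process", "Parametrisation housekeeping bot", "bot operator, finance"),
    KeyChangeIndicator("New version of the standard operating procedure",
                       "process", "Document Versioning housekeeping bot", "process owner"),
]


if __name__ == "__main__":
    for kci in KCI_REGISTER:
        print(f"[{kci.level:^7}] {kci.name} -> monitored via {kci.mechanism}, owned by {kci.owner}")
```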

2. Deployment of a flexible human-bot collaborative control model

The goal here is to decrease manual recovery workload during downtime and/or scale your bot workforce during peak times.

There is a prerequisite for this: a Microbots architecture for RPA must be deployed, where bots are built up of multiple self-contained and independent blocks/actions (equivalent to individual automated tasks).

These automated tasks are then combined and scheduled to run in an orchestrated way against single or multiple runtime environments (they could even use different RPA technologies)... but they remain independent, and this is the key to resilience.

Bot scheduling/orchestration becomes essential here, and your robotic operation centre(s) should master it.

There are many more benefits to adopting a Microbots architecture, but the purpose of this post is to focus on the resilience dimension. I invite you to read 2 amazing posts for more details on Microbots: Microbot architecture for RPA by Robson Fernando Veiga and Microbots: composition and orchestration by Roger Berkley.

A resilient RPA environment should react to changes by optimising the task allocation between human operators and bots during downtime

As a simple illustrative example, take a situation where you have a bot made of 3 automated tasks: Extract report A from a bespoke legacy Contract Management system (task 1), Transform the data and populate file B (task 2), and Enter the data from file B into the SAP Contract Lifecycle Management system (task 3).

Imagine now that task 1 fails. If you don’t have a modular architecture, the bot will fail in its entirety because the underlying automated tasks are interdependent (tasks 2 and 3 will not be executed unless task 1 is fixed). So you will first need to fix task 1 and then re-trigger the bot via a new scheduling event.

Task allocation between the human operator and the bot is sub-optimal, given that the human operator will need to manually execute tasks 1, 2 and 3 while the bot is re-calibrated (due to the change). 100% of the tasks will need to be executed manually during the bot recovery period.

In the case of a Microbots architecture, if task 1 fails, the human operator will manually execute task 1 during downtime and will let tasks 2 and 3 run automatically (given that individual tasks are initiated independently via different scheduling events).

Task allocation is optimised in this case and the recovery workload on the human operator is minimised: only 33% of the overall process/subprocess execution will be done manually during downtime.
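
As an illustration, here is a minimal sketch of this pattern in Python. The task functions and the human work queue are hypothetical stand-ins; a real implementation would rely on your RPA platform’s scheduler and work queues, with each task triggered by its own scheduling event.

```python
"""Illustrative Microbots-style handling of a task failure during downtime (sketch only)."""


def extract_report_a() -> str:
    """Task 1: extract report A from the legacy Contract Management system."""
    raise RuntimeError("UI change in the legacy system: extraction failed")


def transform_to_file_b(report_a: str) -> str:
    """Task 2: transform the data and populate file B."""
    return f"file_B_built_from({report_a})"


def load_into_sap(file_b: str) -> None:
    """Task 3: enter the data from file B into the target SAP system."""
    print(f"Loaded {file_b} into SAP")


def run_pipeline() -> None:
    try:
        report_a = extract_report_a()
    except RuntimeError as err:
        # Task 1 is routed to a human work queue; only ~33% of the work becomes manual.
        print(f"[HUMAN QUEUE] Task 1 needs manual execution: {err}")
        report_a = "report_A_provided_manually"   # output handed back by the operator

    # Tasks 2 and 3 remain automated because they are independent units.
    load_into_sap(transform_to_file_b(report_a))


if __name__ == "__main__":
    run_pipeline()
```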

A resilient RPA environment should react to increased demand by easily scaling the bot workforce during peak time

Scalability is an attribute that describes the ability of a bot to grow and manage increased demand. A bot that is scalable has an advantage because it adapts more easily to changing demand.

Let’s say now that the volume of data that needs to be processed via task 3 has drastically increased, and you still need to enter all the data within a very restricted time window in order to meet business requirements.

In this case, only task 3 is executed against multiple alternative runtime environments, allowing all the data to be entered via concurrent logon sessions. Without a Microbots architecture, the entire automation chain (all dependent automated tasks) would have to be initiated.
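
Here is a minimal sketch of this scale-out idea, assuming a thread pool as a stand-in for multiple bot runtime environments; the runner names and the enter_batch_into_sap() function are purely illustrative, not a real platform API. On a real platform, the orchestrator would dispatch the same microbot task to several runners.

```python
"""Illustrative scale-out of task 3 across runtime environments (sketch only)."""

from concurrent.futures import ThreadPoolExecutor

RUNTIME_ENVIRONMENTS = ["runner-01", "runner-02", "runner-03"]   # hypothetical bot runners


def enter_batch_into_sap(runner: str, batch: list[str]) -> str:
    """Task 3 only: enter one batch of records via one logon session."""
    return f"{runner} entered {len(batch)} records"


def scale_out_task_3(records: list[str]) -> None:
    # Split the increased data volume evenly across the available runners.
    n = len(RUNTIME_ENVIRONMENTS)
    batches = [records[i::n] for i in range(n)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        for result in pool.map(enter_batch_into_sap, RUNTIME_ENVIRONMENTS, batches):
            print(result)


if __name__ == "__main__":
    scale_out_task_3([f"contract_{i}" for i in range(250)])
```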

What other mechanisms are you deploying to increase bot resilience?

I would be very interested in hearing your thoughts. If you would like to receive my future posts then please follow me.

Opinions expressed are solely my own and do not necessarily express the views or opinions of my employer.
