What is Surface Automation, really?
Last week, during my team's final prep for a presentation to a group of senior Operations officers at a major health insurance company, a teammate asked me:
- Hewitt, I saw you put Surface Automation and Frequency of Screen Changes as two separate drivers in our development effort assessment. Explain to me: why are these two mutually exclusive, and why do they carry the highest weights?
I put my technical thoughts into words without considering my teammate's business background:
- Surface Automation is an image-based approach to automation. Whatever keys (alphanumeric or cursor) you send to the screen with this approach are, in essence, sent to an image or a portion of the screen. Frequency of Screen Changes is a separate variable in estimating development effort because, even without Surface Automation, developers still spend more time when a process involves many screen changes.
My teammate looked even more confused:
- What's an image-based approach? Isn't that how most RPA tools typically work?
- No. Typically, programs communicate with each other at the API level.
- What is API?
- ....
I realized my teammate's lack of understanding of the technical fundamentals behind RPA tools was not unique. A year ago, I didn't know them either. Neither did many of the folks I've worked with so far.
That itch motivated me to dig deeper into the tool documentation and venture an answer that is more accessible to professionals learning about RPA technology.
So let's begin...
Most modern business applications contain buttons, fields, check boxes, etc. Imagine the surface of an application as a landscape made of those elements, arranged so that users can click, type and check for a business purpose. These are user interface elements, aka UI elements.
What's the API approach?
Using a convention such as Win32, Active Accessibility or HTML, Blue Prism (BP) can, to a certain extent, extract the "address" of the UI element being captured or spied, and suggest options for users to choose what types of actions BP can perform on it.
It's similar to how, in accounting or law, a professional refers to a standard or a statute by giving you a reference in the form of a notation. From the notation, you can quickly tell the legal domain, section, chapter, sub-chapter, paragraph, subparagraph and bullet point being referred to. Any trained professional can tell you how helpful that notation is in their research.
Similarly, these "addresses" give BP the ability to access the UI elements in a split second as the code runs. Once those "addresses" and the actions have been stored in BP, they are "remembered" or "trained".
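To make the idea concrete, here is a minimal sketch of attribute-based addressing using Selenium against a web page, purely as an analogy for the HTML convention mentioned above; it is not Blue Prism itself, and the URL and element ids are hypothetical:

```python
# Analogy only: addressing elements by stable attributes, not by pixels.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/claims")            # hypothetical claims screen

# The field is found by its "address" (here, an id), then acted upon.
member_id = driver.find_element(By.ID, "memberId")  # hypothetical id
member_id.send_keys("A123456")

driver.find_element(By.ID, "searchButton").click()  # hypothetical id
driver.quit()
```

The key point is that the element is located by a stable attribute, so it is found correctly no matter where it happens to sit on the screen.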
A potential downside of this approach is the inherent limitations of the conventions mentioned above. For example, Win32 is .NET- and C-based, so you can't expect it to perform well against SAP, software written in ABAP and Java. Active Accessibility is aging and receiving less support. HTML only applies to applications hosted in a web browser.
What's the image-based approach (Surface Automation, in BP terms)?
In situations where BP cannot use those "addresses" reliably with the available conventions, the image-based approach is the go-to alternative. In this approach, BP treats the landscape as a smooth, featureless surface.
In other words, BP treats the interface as an image. Fields, check boxes and buttons are instead identified by their position relative to the outer frame.
In case you didn't know, one (x, y) coordinate represents one pixel on the screen. Two (x, y) coordinates then define a rectangle by its upper-left and lower-right corners. One rectangle makes up a region, and on a region users can essentially perform two types of action: press keys and click. Put simply, BP has to be "trained" with the series of (x, y) coordinates and the types of actions to perform on those regions.
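For illustration, here is a rough sketch of the same mechanics outside BP, using Python's pyautogui library; all coordinates below are made up:

```python
# Coordinate-driven automation in miniature (not Blue Prism itself).
import pyautogui

# A region defined by its upper-left and lower-right corners, in pixels.
top_left = (200, 100)
bottom_right = (400, 130)

# Click the centre of the region, then send keys to whatever now has focus.
center_x = (top_left[0] + bottom_right[0]) // 2
center_y = (top_left[1] + bottom_right[1]) // 2
pyautogui.click(center_x, center_y)
pyautogui.write("A123456")   # alphanumeric keys
pyautogui.press("enter")     # a cursor/control key
```

Notice that nothing here knows what the field is; the automation only knows where to click and what to type.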
An analogy for this approach is a pilot tasked with dropping humanitarian aid on a target from high above. He is given the geographic coordinates of the target. Then, factoring in his speed and the wind's speed and direction, the computer tells him the coordinates and the time at which he must release the aid.
Why is Surface Automation sub-optimal?
Coming back to the humanitarian aid example, let's compare two delivery methods, representing Surface Automation and API automation respectively.
While Surface Automation is like dropping the aid from high above, the API-based approach is like having a worker deliver it on the ground, identifying the target by street address or a "remembered" path.
However, using coordinates to send clicks and keys rests on a few assumptions developers must be cautious about.
First of all, the position of the region trained as a UI element must not change between configuration and deployment. In the world of software development, user interfaces change continuously, so developers will have to spend time "re-training" the tool whenever that happens.
Second, screen resolutions may differ from environment to environment. A pixel noted as (200, 100) on a 1024 x 768 screen and on a 1920 x 1080 screen points to two different places, as the short sketch below illustrates.
Third, when developers have to use Surface Automation, it usually involves more elaborate timing of pauses between actions and knowledge of application shortcuts. More advanced skills are therefore required to deal with Surface Automation.
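To make the second point concrete, here is a small illustrative calculation (my own, not from any BP documentation) of where a naively rescaled coordinate ends up on a larger screen:

```python
# Illustrative only: where does a point trained at 1024x768 land at 1920x1080
# if we rescale it proportionally? Real layouts rarely scale this cleanly,
# which is exactly why hard-coded coordinates are fragile.
def scale_point(x, y, old_res=(1024, 768), new_res=(1920, 1080)):
    """Proportionally rescale a pixel coordinate between two resolutions."""
    return (round(x * new_res[0] / old_res[0]),
            round(y * new_res[1] / old_res[1]))

print(scale_point(200, 100))  # -> (375, 141): a different pixel entirely
```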
For those reasons, Frequency of Screen Changes and Surface Automation are mutually exclusive drivers and should top the list of factors when assessing development effort for a project.
Comments are welcome!
[Side Note]: Today, September 28th, 2017, Blue Prism announced the release of version 6.0. The release claims to include more Surface Automation features that increase performance and reduce complexity for users designing and building automations. As a developer, I'm more than ecstatic about this move >:)