Is OCR Technology Cool Again (in an RPA Context)?
OCR Technology has been around for a long time. It has mostly been a part of Enterprise grade Document Management and Data Capture solutions, such as products from Kofax, IBM and many other Vendors. Most of these implementations are part of holistic solutions that bundle Document Management, Data Capture, BPM and other capabilities. Almost all Fortune 1000 companies have some implementation of this type of Enterprise software somewhere in their back office operations. However, many siloed Business Units are either not exposed to this technology or find it hard to integrate or work with these large Enterprise Systems.
The “Shadow IT” revolution, which is driving the adoption of “low code” and “no code” Platforms such as RPA, is challenging some of these Enterprise Software paradigms. Business is excited about building Robots to automate manual Business Processes, and many of these processes involve structured, semi-structured and unstructured documents such as PDF documents, driver’s licenses, W2s, Tax Returns and other manual data entry forms, including handwritten documents. As RPA Automations evolve, integrating RPA bots to handle these Document Processing challenges will become very important (we already see that with many of our Clients).
In this article we are not advocating for the evolution of “Shadow IT” but merely acknowledging the reality that thousands of Bots will be implemented within the Enterprise in the near future. As a company, we consistently advocate for an IT Governed and Business aligned model for large scale RPA Programs. However, we do want to point out that yet another older technology, OCR, is finding new life within the RPA world. The Use Cases are plentiful. For example, a Business Unit within a Bank working on Bots to handle various stages of a Loan file may need to consume data from supporting documents such as W2s, 1099s, Credit Letters, etc. The Business Unit may currently have Loan Officers or other Employees reviewing the supporting documents, classifying them and extracting information so that it can be used as input to the rest of the business process. In this case, as you can imagine, the BU will resist engaging IT to implement a larger solution to this problem and would rather have a Bot that can handle the Automation, including the data extraction from the relevant documents. The Bot could handle such a situation by receiving the relevant document through email (or any other channel), running it through the OCR classification and extraction engine and continuing the automation without any human intervention.
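The flow described above (receive a document, run OCR, classify, extract, hand the result to the rest of the automation) could be sketched roughly as follows. This is a minimal illustration and not any Vendor's API: the OCR engine is passed in as a plain callable, and the keyword rules, field patterns and names such as process_document and fake_ocr are all hypothetical. A real engine would use trained models rather than keywords and regular expressions.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Hypothetical keyword rules per document class (assumption for illustration).
CLASS_KEYWORDS = {
    "W2": ["wage and tax statement", "form w-2"],
    "1099": ["form 1099", "miscellaneous income"],
}

# Hypothetical field patterns per document class (assumption for illustration).
FIELD_PATTERNS = {
    "W2": {
        "employer_ein": re.compile(
            r"Employer identification number[:\s]+(\d{2}-\d{7})", re.I),
        "wages": re.compile(
            r"Wages, tips, other compensation[:\s]+\$?([\d,]+\.\d{2})", re.I),
    },
    "1099": {
        "payer_tin": re.compile(r"Payer's TIN[:\s]+(\d{2}-\d{7})", re.I),
    },
}


@dataclass
class ExtractionResult:
    doc_class: Optional[str]  # None means the document could not be classified
    fields: Dict[str, str] = field(default_factory=dict)


def classify(text: str) -> Optional[str]:
    """Return the first document class whose keywords appear in the text."""
    lowered = text.lower()
    for doc_class, keywords in CLASS_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return doc_class
    return None


def extract(text: str, doc_class: str) -> Dict[str, str]:
    """Pull out the fields configured for the given document class."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.get(doc_class, {}).items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1)
    return fields


def process_document(document: bytes,
                     ocr: Callable[[bytes], str]) -> ExtractionResult:
    """OCR the document, classify it, then extract its fields."""
    text = ocr(document)
    doc_class = classify(text)
    if doc_class is None:
        return ExtractionResult(None)
    return ExtractionResult(doc_class, extract(text, doc_class))


def fake_ocr(document: bytes) -> str:
    """Stand-in for a real OCR engine: treats the bytes as plain text."""
    return document.decode("utf-8")
```

A Bot would call process_document with the email attachment and whichever OCR integration it uses, then pass the returned fields to the next step of the Loan process. Documents that come back with no class are the ones that would still need a person or additional training.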
The solutions we see emerging are ones where RPA Vendors offer robust 3rd party integrations with OCR engines such as ABBYY, Google Tesseract and others. Another thing to note is that advancements in Machine Learning Technology have increased the accuracy of these OCR platforms considerably. Some of these Vendors offer not only on-premises software but also OCR in the Cloud, where you simply submit your documents and receive the extracted information through a well-defined API. It is important to understand that the OCR system either needs to be pre-trained on a particular class of document or needs to be trained on it as part of the implementation.
Most of us have heard a lot about “Smart” or “Intelligent” Automation. We see Document Processing, Classification and Extraction using OCR as one of the key requirements for advanced Automation use cases. Future Use Cases go far beyond simple extraction, as we will eventually see NLP Technologies being used to make Business Processes even more autonomous. For now, there is a need for Bots to process large numbers of standardized documents in possibly every industry, and the solutions to such problems are already on the market.
For near term, ROI oriented use cases, we currently focus on making Robots more autonomous by integrating Data Cleansing (Rotation & Orientation), Classification and Extraction capabilities for semi-structured and unstructured documents. We do this using Machine Learning enabled Vendor products, both where robust pre-trained document types are available and where the products have to be trained on additional document types.
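As a rough illustration of the Rotation & Orientation cleansing step, one common approach is to try each of the four right-angle orientations and keep the one the OCR engine is most confident about. The sketch below assumes a hypothetical OCR callable that returns a (text, confidence) pair, and it uses a small grid of numbers as a toy stand-in for a scanned page. Vendor products typically perform this step internally, so this only shows the idea.

```python
from typing import Callable, List, Tuple

Page = List[List[int]]  # toy stand-in for a scanned page's pixel grid


def rotate_90(page: Page) -> Page:
    """Rotate the page 90 degrees clockwise."""
    return [list(row) for row in zip(*page[::-1])]


def best_orientation(
    page: Page,
    ocr: Callable[[Page], Tuple[str, float]],  # hypothetical: (text, confidence)
) -> Tuple[int, Page, float]:
    """Try 0, 90, 180 and 270 degree clockwise rotations.

    Returns the angle applied, the rotated page and the OCR confidence
    for the orientation with the highest confidence.
    """
    best_angle, best_page, best_conf = 0, page, float("-inf")
    candidate = page
    for angle in (0, 90, 180, 270):
        _, confidence = ocr(candidate)
        if confidence > best_conf:
            best_angle, best_page, best_conf = angle, candidate, confidence
        candidate = rotate_90(candidate)
    return best_angle, best_page, best_conf
```

The corrected page can then be passed on to Classification and Extraction. Trying all four orientations costs four OCR passes per page, which is one reason a dedicated orientation detector is often preferable at volume.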
For more information, please reach out to [email protected]