Using text analytics in Internal Audit

What are the current challenges?

Many organisations are currently undergoing large transformation projects to put data at the forefront of their processes: creating reliable golden sources of data, tracing data lineage front to back, and monitoring data quality with a fine-toothed comb. However, this effort often focuses on only one type of data, structured data, which is data organised in a formatted table or repository.

This is no different for internal audit (IA) teams, who tend to focus on analysing tables of data rather than considering unstructured data, such as documents, emails or website content, when performing data analysis. Of course, a different type of data needs a different analytics toolkit, which is where text analytics and natural language processing (NLP) come in.

How will this benefit my function?

It’s rare to perform an audit where the key controls and processes are recorded in clean structured datasets. It’s more likely that there will be a mix of structured and unstructured data sources, ranging from policy documents and meeting minutes to email chains and client letters. These documents can be hundreds of pages long, yet the meaning and effectiveness of controls can hinge on just a handful of words or phrases within them. Text analytics can bring precision to the review and reduce the need for IA teams to read through swathes of documents. In other words, the benefits you can achieve with conventional data analytics (100% population testing, more efficient testing, and more accurate testing of complex processes) can be achieved against a whole new population of controls and audits.

What does this look like for IA?

Text analytics and NLP are broad subjects and there are lots of different techniques that can be leveraged to provide value to IA. Here we will touch on a few tangible examples, ranging from the simple and effective to more complex techniques that can add huge value:

Data extraction: before you can analyse the content of your unstructured data, you need to get it into a format you can work with, which means loading it into a programming language that lets you analyse and manipulate it. Python is typically used for text analytics and NLP because it has many libraries to support the analysis; however, there are plenty of other options, such as R or out-of-the-box software. These toolkits allow you to extract text and metadata from hundreds of mixed-media documents (PDFs, PowerPoints, Word files etc.) at a time, ready for onward analysis.
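
To give a feel for what extraction involves, here is a minimal sketch that pulls paragraph text out of a Word (.docx) file using only the Python standard library. It relies on the fact that a .docx file is a zip package whose main text lives in word/document.xml. The function names and the sample content are illustrative assumptions, not part of any particular toolkit, and in practice a dedicated extraction library would usually handle this (along with PDFs, PowerPoints and metadata) for you.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML namespace used for paragraph and text elements in .docx files
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_paragraphs(source):
    """Return the non-empty paragraphs of a .docx file as a list of strings.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        xml_bytes = package.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> elements
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs


def build_sample_docx(paragraphs):
    """Build a minimal in-memory .docx-like package (demonstration only)."""
    body = "".join(
        f"<w:p><w:r><w:t>{escape(p)}</w:t></w:r></w:p>" for p in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main">'
        f"<w:body>{body}</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", document)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Hypothetical policy wording, used purely as sample input
    sample = build_sample_docx(
        ["Access is reviewed quarterly.", "Exceptions require approval."]
    )
    for paragraph in extract_docx_paragraphs(sample):
        print(paragraph)
```

Looping a function like extract_docx_paragraphs over a folder of files is what turns a pile of documents into a list of text records that the techniques below can work on.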

Keyword/phrase extraction: while at the easier end of text analytics, this technique can increase efficiency in audit testing. If you’re reviewing lots of documents for specific forbidden wording, sentences containing key words, or patterns like email addresses or customer information, then this can save hours of manual review.
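
As a rough illustration, the sketch below splits text into sentences and flags any sentence that contains a forbidden phrase or an email address, using Python's built-in re module. The phrase list, the sample text, the deliberately simple email pattern and the naive sentence splitter are all assumptions made for the example; a real review would tune these to the audit in question.

```python
import re

# Deliberately simple email pattern, good enough for a first-pass sweep
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def flag_sentences(text, phrases):
    """Return one finding per sentence containing a listed phrase or an email."""
    phrase_patterns = [
        (phrase, re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE))
        for phrase in phrases
    ]
    findings = []
    for number, sentence in enumerate(split_sentences(text), start=1):
        hits = [p for p, pattern in phrase_patterns if pattern.search(sentence)]
        emails = EMAIL_PATTERN.findall(sentence)
        if hits or emails:
            findings.append(
                {
                    "sentence_no": number,
                    "sentence": sentence,
                    "phrases": hits,
                    "emails": emails,
                }
            )
    return findings


if __name__ == "__main__":
    # Hypothetical forbidden wording and client letter text
    forbidden = ["guaranteed return", "risk free"]
    letter = (
        "This product offers a Guaranteed Return to all clients. "
        "Please contact jane.doe@example.com for details. "
        "The committee met on Tuesday."
    )
    for finding in flag_sentences(letter, forbidden):
        print(finding)
```

Returning the sentence number alongside the match means the auditor can go straight to the relevant passage rather than re-reading the whole document.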

Document similarity: ensuring that internal policies are compliant with regulation and guidance can be a time-consuming process that is prone to human error, since a difference of only a word here or there can have a great impact on the control. By analysing the semantic similarity of each part of internal versus external policy using NLP, you can highlight specific gaps or deviations, allowing the auditor to focus on the ‘so what?’
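
The idea can be sketched as follows, with one important caveat: true semantic similarity generally needs language models or word embeddings, so this example swaps in a much simpler technique, cosine similarity over word counts, as a stand-in. It matches each external clause to its closest internal clause, then reports the similarity score and the words that differ. The clause texts and function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts (0 to 1)."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def compare_clauses(external_clauses, internal_clauses):
    """Match each external clause to its most similar internal clause."""
    results = []
    for clause in external_clauses:
        best_match, best_score = None, 0.0
        for candidate in internal_clauses:
            score = cosine_similarity(clause, candidate)
            if score > best_score:
                best_match, best_score = candidate, score
        # Words present in one clause but not the other
        differing = set(tokenize(clause)) ^ set(tokenize(best_match or ""))
        results.append(
            {
                "external": clause,
                "best_match": best_match,
                "score": best_score,
                "differing_words": sorted(differing),
            }
        )
    return results


if __name__ == "__main__":
    # Hypothetical regulatory and internal policy clauses
    regulation = [
        "Customer complaints must be logged within two days.",
        "Staff must complete annual training.",
    ]
    policy = ["Customer complaints must be logged within five days."]
    for row in compare_clauses(regulation, policy):
        print(round(row["score"], 2), row["differing_words"], "|", row["external"])
```

A low score suggests a requirement with no counterpart in the internal policy, while a high score with a short list of differing words (here ‘two’ versus ‘five’) points to exactly the sort of single-word deviation that is easy to miss by eye.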

Part-of-speech analysis: when it comes to meeting minutes and governance reviews, it can be important to identify who said what and when, track actions and monitor tone/sentiment. Using part-of-speech (POS) analytics, you can identify colleague names and pull out all of the items they’ve suggested, agreed with, or countered by looking at the verbs associated with them. Perhaps you don’t just want to see whether a particular project was discussed, but also what was said about it. Again, POS can help you do this by extracting all of the adverbs or adjectives associated with a particular word or phrase. Bringing all of this together can quickly summarise lots of documents into a concise dashboard that allows auditors to pinpoint the parts of documents that are riskier or relate more closely to the scope of the audit.
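
The sketch below shows the kind of logic that sits downstream of a POS tagger. It assumes the minutes have already been tokenised and tagged by an NLP library of your choice, with each token paired with a Penn Treebank-style tag (NNP for proper nouns, VB* for verbs, JJ* for adjectives, RB* for adverbs). The sample sentence, its hand-assigned tags and the function names are all hypothetical.

```python
def verbs_by_speaker(tagged_tokens):
    """Map each proper noun (NNP) to the verb that follows it.

    Adverbs (RB*) between the name and the verb are skipped, so that
    'Tom strongly opposed' still yields 'opposed' for Tom.
    """
    actions = {}
    for i, (word, tag) in enumerate(tagged_tokens):
        if tag != "NNP":
            continue
        j = i + 1
        while j < len(tagged_tokens) and tagged_tokens[j][1].startswith("RB"):
            j += 1
        if j < len(tagged_tokens) and tagged_tokens[j][1].startswith("VB"):
            actions.setdefault(word, []).append(tagged_tokens[j][0].lower())
    return actions


def descriptors_near(tagged_tokens, target, window=3):
    """Return adjectives (JJ*) and adverbs (RB*) within `window` tokens of `target`."""
    target = target.lower()
    found = []
    for i, (word, _tag) in enumerate(tagged_tokens):
        if word.lower() != target:
            continue
        start = max(0, i - window)
        for near_word, near_tag in tagged_tokens[start : i + window + 1]:
            if near_tag.startswith(("JJ", "RB")):
                found.append(near_word.lower())
    return found


if __name__ == "__main__":
    # Hand-tagged sample standing in for the output of a real POS tagger:
    # "Priya proposed a revised timeline. Tom strongly opposed the risky
    # migration project."
    minutes = [
        ("Priya", "NNP"), ("proposed", "VBD"), ("a", "DT"), ("revised", "JJ"),
        ("timeline", "NN"), (".", "."), ("Tom", "NNP"), ("strongly", "RB"),
        ("opposed", "VBD"), ("the", "DT"), ("risky", "JJ"),
        ("migration", "NN"), ("project", "NN"), (".", "."),
    ]
    print(verbs_by_speaker(minutes))
    print(descriptors_near(minutes, "project"))
```

Simple proximity rules like these will miss more complex sentence structures, so treat the output as a pointer to passages worth reading rather than a definitive record of who said what.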

These examples provide just a small flavour of what can be done with text analytics in an audit context. There are plenty of other ways you can add value to your audit function using these techniques, and when combined with more traditional data analysis, they can improve the quality of your internal audit and bring efficiency gains to much of your audit plan and activities. As IA functions continue to explore new ways of utilising analytics within their toolkit, text analytics is one we’re going to see more and more of.

Note: The views reflected in this article are the views of the authors and do not necessarily reflect the views of the global EY organisation or its member firms.
