Doing our bit for data quality
Russell Gammon
Chief Solutions Officer | Building award winning generative AI tools for Tax
[As per normal, these are my views only, not those of the organisation that I represent...]
COREP is pretty much BAU now. Figures are produced by a range of systems, processes and people, in a number of ways. If you take the first COREP filings from Q2 2014 and compare them to COREP filings from Q4 2017, I have no doubt that the overall quality has improved immeasurably. Early COREP filings suffered from a number of basic errors - missing filing indicators, reporting figures in thousands rather than units, negative capital figures; you name it - it probably happened!
Not only have firms upped their game, but over the last 3 years a fair amount of progress has been made on the regulator side. The ECB has brought in around 150 plausibility checks that it runs on the data to sense-check the figures. The UK regulator also appears to periodically run its own validity checks on certain filings, over and above the EBA checks. Our clients now have these rules running as standard on their submissions, alerting them not only to GABRIEL auto-checks, but also to checks that may be run on their data further down the road.
More is planned; of particular interest in the UK is this short note put out by the Bank of England, who have "proved the concept" regarding better, more streamlined analysis of the data they receive.
This is all good news. However, there is way more that can be done.
A lack of validations
Using the EBA validations as an example, there are thousands of validation rules - a significant volume to manage and to keep correct (hint: some of them often aren't). However, they are certainly not an exhaustive list of all the validations that could be in place.
For example, there has never been an EBA rule requiring the "Total" sheet in C07.00.a to be the sum of the associated breakdown schedules. However, in 99.99% of cases, it really should be.
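That "Total equals the sum of the breakdowns" expectation is straightforward to state as a check, even though no official rule enforces it. A minimal sketch in Python - the sheet names and figures here are purely illustrative, not real template references:

```python
# Sketch of the missing cross-sheet check described above:
# the "Total" sheet should equal the sum of its breakdown sheets.
# Sheet names and figures are illustrative only.

breakdown_sheets = {
    "breakdown_1": 400.0,
    "breakdown_2": 350.0,
    "breakdown_3": 250.0,
}
total_sheet = 1000.0

# Allow a small tolerance for rounding differences between sheets
tolerance = 0.01
difference = abs(total_sheet - sum(breakdown_sheets.values()))
print("PASS" if difference <= tolerance else f"FAIL (difference: {difference})")
```
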
Moreover, the general approach with new templates brought in by the EBA seems to be: add the templates now, put a few basic checks in, but don't "over-validate" from day one; once a template settles, validations can be added. However, that leaves firms to devise their own data quality checks when analysing the templates.
In addition, firms often want to run basic checks such as:
> Are my headline capital figures greater than £x and lower than £y?
> Have I completed my buffer figures?
> Are my subtotals rolling up correctly, if not covered by EBA rules?
Typically, this involves manual checks (think =IF(A=B+C,"","FAIL!") or similar in Excel). These aren't easy to maintain and can be quite 'upstream' from the actual reporting, so they might get ignored by the time the XBRL is actually being generated.
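The three kinds of check in the bullets above can be expressed programmatically rather than as fragile spreadsheet formulas. A minimal sketch in Python - the cell names and thresholds are hypothetical, invented for illustration:

```python
# Sketch of the in-house checks described above.
# All names and figures are illustrative, not real COREP references.

report = {
    "capital_total": 1_250_000,  # headline capital figure, in units
    "buffer": 62_500,            # buffer figure
    "subtotal": 300_000,
    "component_a": 180_000,
    "component_b": 120_000,
}

def run_checks(data, lower=1_000_000, upper=10_000_000):
    """Return a list of (check name, passed) pairs."""
    results = []
    # Range check: headline capital between £x and £y
    results.append(("capital in range",
                    lower <= data["capital_total"] <= upper))
    # Completeness check: buffer figure reported at all
    results.append(("buffer completed", data.get("buffer") is not None))
    # Roll-up check: subtotal equals the sum of its components
    results.append(("subtotal rolls up",
                    data["subtotal"] == data["component_a"] + data["component_b"]))
    return results

for name, ok in run_checks(report):
    print(f"{name}: {'PASS' if ok else 'FAIL'}")
```

Keeping the checks in one place like this means they run every time the data does, rather than living in a spreadsheet well upstream of the filing.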
Our solution
We've been working on this one for a while, so forgive my level of excitement. It's also, as far as I know, the only platform that does this the way we've done it. Our next major release will feature something called "User Validations". We took a look around the market at the ways other vendors add data quality checks. These ranged from hard-coded calculations (not an ideal way to do things) to on-screen wizards for adding A + B = C.
We decided to do it differently.
Over the last 4.5 years, the EBA, and more recently EIOPA, have defined their own syntax for expressing validations. Other regulators, particularly the Bank of England, are also tending to use this syntax when publishing their own taxonomies. Every time the regulators now update the rules, they follow that established syntax, which helps the software vendor community. The most recent set of EBA rules, for example, is here.
The syntax that has been established is pretty extensive, ranging from basic arithmetic and checks that cells are or aren't completed, through to some quite complicated functions including filtering, IF, OR, AND, NOT, LIKE, THEN, etc.
So, we thought we'd allow users to use all of that lovely syntax to define their own rules. The result - they can upload their own set of validations to run on the data.
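To give a flavour of the idea, here is a toy evaluator for user-supplied rules written in an EBA-like style. The real syntax is far richer (filters, LIKE, THEN and so on), and the cell references below are invented, not real template coordinates - this is only a sketch of "parse a rule, substitute cell values, check it holds":

```python
import re

# Toy evaluator for user-supplied rules in an EBA-like syntax.
# The real syntax is far richer; the cell references here are invented.

cells = {
    "{C 01.00, r010, c010}": 500,
    "{C 01.00, r020, c010}": 300,
    "{C 01.00, r030, c010}": 200,
}

def check_rule(rule, data):
    """Evaluate a rule like '{ref} = {ref} + {ref}' against cell data."""
    # Substitute each {...} reference with its numeric value
    expr = re.sub(r"\{[^}]+\}", lambda m: str(data[m.group(0)]), rule)
    lhs, rhs = expr.split("=")
    # eval() is fine for a toy; a production rule engine would use a real parser
    return eval(lhs) == eval(rhs)

rule = "{C 01.00, r010, c010} = {C 01.00, r020, c010} + {C 01.00, r030, c010}"
print(check_rule(rule, cells))  # → True
```

The appeal of reusing the regulators' syntax is exactly this: rules arrive as plain text in a format firms already read, and the software does the substitution and evaluation.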
So something like:
(Looks a lot like the EBA rules...)
Becomes something like:
Some fetching green cells which validate based on your own logic, over and above the EBA checks. Not only will this calculate (so you don't need to worry about actually linking from Excel, if you don't want to), but it will validate too, per your logic.
Now our clients will be able to run these checks in an automated fashion, and construct them in a familiar format. They run natively within K-Helix, so there is no need to "run a validation process" (which I hate the idea of...)
This has a wide range of applications. For example, how about running data quality checks on XBRL generated by another system? How about creating test data for new taxonomies in a streamlined fashion? I'm sure there are other uses for a generic formula parser...
So, in around 6 weeks time we'll be launching this out to the market. I can't wait to hear initial feedback and see how this ends up being used in practice.