Archiving - The ultimate battle between Operations and Finance
Craig Watts
Believes Support is more exciting, dynamic and much more interesting than Implementation. But doesn't understand why others disagree.
As I sit here waiting upon a few indexes to create and a few investigative queries to complete on a database in drastic need of Archiving I got to thinking. Archiving has always been a battleground, we know we have to do it at some point, we can never decide upon the rules and often the 'rules' will not have the desired impact anyway. Because of this we more often than not leave the execution too late and the exercise becomes one where a standard approach cannot be utilised. There are also very few (if any) archive tools available within the Dynamics space. OK, so there is IDMF and I even managed to get it working once but there's very little else.
But I'm not here to tout a new tool (still working on it), I'm here to canvas opinion for a concept I've been putting forward for a long time with mixed success.
So here it is, the Operational database only requires a rolling 15 months of data. Anything older can be archived. Quick side note, I did say archived, not purged, that distinction will become important as we progress.
Let's cover why this cannot be done.
The first and probably my favourite is that we have a statutory requirement to retain financial data for a number of years, be that 5, 7 or in some jurisdictions I've even come across 10 years. Over the years I've had audit requests for data and specifically the ability to recreate an Invoice which were quite a number of years old. Everything I've ever read on this type of requirement is that the information needs to be available, absolutely nowhere have I read that it has to be available in the operational database. We can put it another way, back in the good old days of paper records (data storage was expensive then) what would happen when there was a request for an older piece of information? You'd send an Accounts Clerk down to the Archive to search through the boxes of files to find the information.
Been there, done that. Top shelf is always a fun activity.
There's a big difference these days, data storage is relatively cheap, the indexing is data driven so the random box of May 2002 invoices will not be found sitting between October and November 2008. So why not go to archived data for anything required more than 15 months old? Don't have to worry about any dust related allergies these days as the process is a lot cleaner.
Another common counter to limiting the amount of data available in the operational database is the Business Decision Making process. For example, we base our Sales Forecasts on data gathered over the past 5 years. But I'm really hoping this is not only using the Operational Database as this is primarily a read operation and an operational database should really be setup for write operations. This particular counter can be easily address by using a Data Warehouse and the condition for it's setup should be the use of deltas as opposed to a full build at time of extract. Although if you're in need of archiving you're likely there already.
A third counter I often hear is in relation to comparative reporting. Be that peak sales periods year on year or financial comparisons. As with the decision making process this is a reporting activity and should really be done utilising a Data Warehouse. In a Data Warehouse you can go back as far as you have available data for and it'll be much more efficient and streamlined with the proper indexing profile.
Why 15 months?
That's an easy one, it's to cover a Financial Year and ensure it can be closed out, audited and adjusted properly. If it wasn't for that I'd likely shoot for oldest aged stock item or oldest open project.
Why bother?
You probably knew I'd loop back eventually. It's about System Performance and the ability to make performance improvements. Let's cite an example, I had a client with an excessively large database, say for example, 50Tb. 90% of the data was in 10 transactional tables and of this about 70% of that data was not actually data but indexes. When I tried to create an index on one of these tables it ran for 2 days before the process needed to be killed. You also have database maintenance challenges as rebuilds are borderline impossible and auto statistics will rarely fire. Manual statistics can take days to complete. Simply put it is a challenge to maintain and improve the thing.
What would happen if we implement the 15 month archive?
Let's assume they had 6 years of data in the system (it's actually more but I'll make the mathematics simple). We could expect the database size to be reduced to around 10Tb, which would then allow us to address the indexing ratios and maybe pull another 2-3Tb out. So our operational database size reduces dramatically. As do the backup storage requirements. The only thing which will see an increase, and this increase should be quite dramatic, is the speed of system processes.
Why is it a battle?
To approach archiving in this manner a decision has to be made in relation to what the purpose of an ERP solution is. Is it a finance tool and as such subject to the statutory requirements? Or is it an operational tool and as such should be made as efficient and streamlined as possible?
As Finance, particularly in relation to the General Ledger, is primarily a reporting function their needs can be catered for within both the archived data and the data warehouse.
Where as Operations require an efficient ERP solution to drive day to day business and in essence create those reportable financials. For example, if it takes 8 hours to run a Trial balance report the business is not directly impacted. If it takes 8 hours to generate documentation for a loaded truck there's an impact upon delivery windows, driver schedules, customer performance metrics and all manner of other operational activities.
When it directly impacts the bottomline I know where my money is. So let's declare the ERP as an operational tool and execute on a rolling 15 months.
One final caveat, yes there are a number of exceptions to this 15 month rule and each case needs to be looked at individually. While I could start a list here I'm running out of space and feel sure they'll be pointed out to me.
Believes Support is more exciting, dynamic and much more interesting than Implementation. But doesn't understand why others disagree.
3 年There is usually a Dilbert for every occasion, especially in this industry. Now while I'm not recommending the Dilbert storage methods we do need to consider how often historical data is accessed and whether the Operations database needs to be bogged down by it.