The True Cost of an Excel Loader

The True Cost of an Excel Loader

Do you work in a regulated industry, with large volumes of source data, performing analyses in Excel? Do you join those analyses with other departments' data, for aggregating into group reporting results?

Perhaps you are being burned by broken links, voting to veto VBA, or otherwise making a migration from manual macro driven processes into a secure and scalable solution for ad-hoc business data management. You need to get your Excel data, into a database, fast.

You ask around.

Can You Integrate This Excel Into Our Group Reporting Flow?

Sure! You may hear. 5 minutes. Maybe less. There are 1000 quick fire ways to do this. In fact, our database has a button that lets you load data directly from Excel. It's easy - my cat could do it.

Except... In the corporate world... It's never easy. There are two main approaches to load Excel into a cloud database, or any data platform - the first is the quick-fire method above. Or should I say - quickly fired. Whilst fantastic and convenient for ad-hoc server-side analysis, such "uncontrolled" ingestion of business data into a Data Warehouse or Reporting Layer will tend to quickly spiral into a "data swamp" of undocumented, highly duplicated data that nobody dares to delete as nobody really knows where any of it came from, what it means, or what it might break. Being outside of IT governance, such reporting processes are typically classified as "End User Computing" with associated risks and additional audit costs.

The true cost of uncontrolled Excel uploads therefore include:

  • Audit & Compliance Reviews
  • Administration of business "sandbox" areas, or silos
  • Maintenance of manual yet business critical processes

You could say that this approach is "buy now, pay later". So what to do instead?

You need to get your Excel data, into a database, fast.

A Rock and a Hard Place

The second way could be described as the "Hard Place". It's the IT route. Yes - it's easy to load Excel into IT using a plethora of enterprise tools. The development part is trivial. The hard part, particularly in large organisations, is the governance and due process - submitting requirements, testing, promoting through multiple environments, representation at Change Advisory Boards, setting up SLAs, waiting on key stakeholders.

At the end of it you will have a 'hardened' ETL process that typically expects an Excel file to manually arrive on a (hopefully secured) shared network drive, on a specific date, with a specific filename, columns on a specific sheet in a specific order with correct data types. What could go wrong?

Of course, all problems can be addressed, and a range of scenarios dealt with, it's just... expensive. Not just from a delivery and deployment perspective, but also the opportunity cost - all that time spent convincing IT to load your spreadsheet, and then for them to actually do it, could have been spent delivering other strategic projects. Especially for simple spreadsheet uploads.

The result of this friction is that a large amount of business data will never make it into IT controlled databases or data lakes.

No alt text provided for this image

The Third Way

What if you didn't need to write code or jobs to match columns in a spreadsheet with a target table?

What if business users could drag & drop Excel into IT Controlled databases using only a browser?

And what if that same application would enforce an approval workflow to assign responsibility for data changes, and trigger additional jobs to run on data change events?

This is not the stuff of dreams. It's one of many features delivered by Data Controller for SAS?, which makes use of the powerful SAS/ACCESS engines to let business users load data safely into databases such as Postgres, Oracle, SQL Server, Teradata and Redshift, not to mention regular SAS datasets and CAS tables.

The basic flow looks like this:

Data Controller for SAS process flow

An Editor changes some data - either by loading a spreadsheet, a CSV, or making changes directly in the web interface. Assuming the validation checks pass, the data is transferred securely to a backend Staging Area.

One or more Approvers must approve the requested change, by reviewing the differences between the target table and the uploaded data (if 6000 rows are uploaded but only 6 are changed, only 6 rows are reviewed). Following approval, the database is updated, and any post approve jobs are triggered.

An Auditor can view the change history of any table, see who submitted and approved them (and any reason supplied), and even download a copy of the original Excel file.

Is it Secure?

Data Controller is significantly more secure than staging data with shared network drives, or letting end users update target tables with their own credentials.

Authentication is handled using SAS Logon - the same as any other SAS Web app, fully integrated with enterprise authentication providers such as LDAP.

Most importantly, separation is made between the owners of the data model (IT), and the owners of the data itself (the business).

We don't change the data. It comes from source systems.

If this is truly the case, that all your IT Controlled data comes 'untouched' from source systems, with no business inputs - then your reporting system is almost certainly based on Excel or a business owned database.

Even if you have somehow built a regulatory reporting system that requires no ad-hoc data, there may still be a need for adjustments to deal with data quality issues or manual overrides. The following video illustrates a potential business situation.

Our data is precious. Can we trust this?

Data Controller is battle hardened with dozens of deployments worldwide. Esoteric gotchas such as high precision decimals, encoding differences, short numerics, multiple metadata repositories and more have been captured and resolved. We have many layers of testing in the product itself, and the core components have been running on customer sites for over 10 years. We have official reviews from Der Touristik, Siemens Healthineers and Allianz Insurance.

What else does it do?

The following features are also built in:

  • Data Alerts. Get notified when your data has been modified.
  • Data Lineage. Trace the SAS 9 data lineage at column or table level, and export to SVG, PNG or CSV.
  • Data Dictionary. Describe your data artefacts in multiple dimensions.
  • Data Catalog. Track all your libraries, tables, columns and primary keys with a full table search to boot.

Most importantly, separation is made between the owners of the data model (IT), and the owners of the data itself (the business)

Summary

If you are:

  • Working in a regulated environment (Sarbanes Oxley, NBB, IFRS, SII, Basel, etc)
  • Collaborating with other departments to produce group reporting results
  • An existing SAS customer (SAS 9.3 or Viya 3.5 and above)

Then Data Controller is for you. Our plans start at 5 users - if you need less than that, there's nothing to pay. To arrange a demo, installation, or discussion - do get in touch!

Mary Grace C.

Military Retiree,Management Scientist, Public Speaking, SAS Data Scientist, University Lecturer,Catechist

2 年

well said. I left my employment with SAS 10 years ago and stepped out of the corporate working data analysis world 3 years ago - so I dont know the current state of affairs. but I do see Microsoft spreadsheets continuing as the primary tool when SAS and other data management tools offer so much better integrity and analysis capabilities

Ese Omatsone

Professional Engineer, Business/Data Analyst, Energy Researcher (Renewables), Storyboard Creator

3 年

Great insights, Allan. More to come, eh?

Scott Thomson

Meet next generation financial product management

4 年

Mort, I thought I knew all SAS PROCS , gents you are adopting legacy thinking, time to look at WPS

???Morton Geppert MBA FCIM

Unlock SAP in a SNAP | Find, Fix and Enrich ECC6 and S4/Hana | Master SAP data, Update SAP transactions, Automate SAP processes - Report, explore and validate instantly without code - Finance, Operations & Supply Chain.

4 年

I like option 4. Leave the data in the operational / ERP system like SAP as is and virtualise it in Excel only - If you have to export data from excel to a DB then something is very wrong - This way Excel is derisked, the users get the precision accuracy they need, data warehouse is not stuffed with pointless redundant data and SAS consultants can focus on Proc Genmod and Proc Inbreed where they belong. It takes a few hours to set up and requires no consultancy... What's not to like?

回复

要查看或添加评论,请登录

Allan B.的更多文章

  • Zen and the Art of Data Quality

    Zen and the Art of Data Quality

    What is “Quality”? In his 1974 book “Zen and the Art of Motorcycle Maintenance”, Robert Pirsig considers this a…

    10 条评论
  • 5 Tips for SAS App Developers

    5 Tips for SAS App Developers

    If you have SAS, you have a powerful platform for Enterprise App Development - one that runs extremely fast, connects…

    5 条评论
  • SAS London User Group

    SAS London User Group

    The launch party for the SAS London User Group happened last night, and what an event it was! In just 5 weeks we have…

    8 条评论
  • Easy AF SCL Modernisation with HTML5 and SAS

    Easy AF SCL Modernisation with HTML5 and SAS

    A long time ago in a galaxy far, far away - the only SAS available was Base SAS. Options to surface SAS in an…

    14 条评论
  • SAS UK Forum 2016

    SAS UK Forum 2016

    Another year, another UK SAS conference! 2016 marks a return to the land of Brum, a venue not used since 2007 and home…

    3 条评论
  • SAS Global Forum 2016

    SAS Global Forum 2016

    If SAS are known for doing one thing well (other than software), it has to be - throwing a good party! And this year…

    5 条评论

社区洞察

其他会员也浏览了