The Data Management Framework
The foundation for any successful D&A program is the Data Management Framework (DMF), which describes the Company’s operating model for the effective management of data assets. A comprehensive, multi-dimensional Data Management Framework outlines the relationships between stakeholders, people, process and technology to govern the delivery of enterprise-wide data capabilities that support customer privacy and consent, consistent and trusted insight development, and evidence-based decision making. Portions of the DMF approach have been defined, refined and employed in more than 40 programs/engagements across multiple industries and organizational sizes over the past 20+ years.
Each dimension of the DMF defines and describes the essential ingredients (both the what and the how) behind the D&A program including the target state, the operating model and business engagement.
Part One of the DMF series <https://www.evanta.com/resources/cdo/peer-practices/the-foundation-of-a-robust-data---analytics-program---part-one> introduced three of the four DMF core dimensions: stakeholders, people, and technology. The details underpinning each core dimension make up the ingredients of a proven recipe for data program success, extending beyond project hierarchical structures, roles, responsibilities and tools. For those in earlier stages of their D&A program journey, reviewing the first article in the series will help to identify gaps and to develop and evolve new strategies.
This second article in the series outlines the approach and components for the final dimension: Process.
Process - What components are needed to manage an organization's data?
My prior work has identified six essential Process components required to construct a viable D&A program.
The first Process component is critical data elements (CDEs). The amount of data a business generates is growing at an exponential rate, and not all data is of equal importance. It is imperative to focus data management efforts on the data that has the greatest impact and value to the business. Critical data is therefore of utmost importance to management decision making, operational and strategic planning, and risk management. Critical data can be identified through a review of key business processes, reports and analytics. Critical data typically consists of a business concept, related business terms and a business description.
While a lot of critical data will be composed of many physical data elements (mapped from multiple sources and integrated at the critical data or business term level in a many-to-one relationship), and those elements likely exist in many places, often only one physical data element in one location will be identified as a CDE. The Authoritative Source for each CDE is trusted as a true representation of the data for its intended purpose, where the CDE is (see the sketch after this list):
· As close as possible to its point of origin or creation.
· Not significantly altered from its point of origin.
· Considered to be highly reliable and/or authentic.
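To make the CDE notion concrete, here is a minimal sketch of how a CDE and its authoritative source might be recorded; the element names, systems and field layout are illustrative assumptions, not a prescribed design.

```python
from dataclasses import dataclass, field

@dataclass
class CriticalDataElement:
    """A single CDE: one physical element in one location, tied to a business concept."""
    business_concept: str          # e.g., "Customer"
    business_term: str             # e.g., "Customer Date of Birth"
    business_description: str      # plain-language meaning agreed by the business
    authoritative_source: str      # the one system trusted for this element
    physical_element: str          # schema.table.column in the authoritative source
    source_mappings: list = field(default_factory=list)  # other systems mapped many-to-one

# Hypothetical example: many physical copies exist, but only one is designated the CDE.
dob = CriticalDataElement(
    business_concept="Customer",
    business_term="Customer Date of Birth",
    business_description="The customer's date of birth as captured at onboarding.",
    authoritative_source="core_banking",          # closest to the point of origin
    physical_element="core_banking.customer.birth_date",
    source_mappings=["crm.contact.dob", "warehouse.dim_customer.birth_dt"],
)
print(dob.business_term, "->", dob.physical_element)
```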
The second Process component is metadata. Metadata (i.e., data about data) is at the center of all successful D&A programs. This includes business, technical, and operational metadata. The business does not always fully appreciate what metadata actually is, what it represents, and why it matters for understanding the data that the enterprise generates and owns. The following story helps explain metadata in simple, relatable terms.
Visualize a massive container ship traveling across the ocean with thousands of shipping containers onboard. Each container represents a physical data element (typically from operational source systems). If you wanted information about what was in a specific container, you would need to open it, look inside and describe what you see. Your unique description of the contents in simple business terms is your business metadata.
Then, you close the lid of the container and examine the physical container itself. You measure the physical dimensions: its height, its width, its depth, and other relevant facts (e.g., is it a refrigerated container? Is it for hazardous materials?). The physical description of the container’s characteristics is your technical metadata.
Finally, when the ship docks, the containers are offloaded. Capturing the details of the movement of each container (e.g., time, date, new conveyance) is your operational metadata. It describes the facts captured about data movement from one point to another and beyond.
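Carrying the container story back to data, a minimal sketch of the three metadata types for a single physical element might look like the following; every field name and value is an illustrative assumption.

```python
# Hypothetical metadata record for one physical data element (one "container").
element_metadata = {
    "business": {             # what you saw when you opened the container
        "term": "Customer Date of Birth",
        "definition": "The customer's date of birth as captured at onboarding.",
        "owner": "Retail Banking Operations",
    },
    "technical": {            # the container's physical characteristics
        "system": "core_banking",
        "table": "customer",
        "column": "birth_date",
        "data_type": "DATE",
        "nullable": False,
    },
    "operational": {          # the record of the container's movement
        "last_loaded": "2023-05-01T02:15:00Z",
        "source_feed": "daily_customer_extract",
        "rows_moved": 1_250_342,
        "load_status": "success",
    },
}
```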
The third Process component is data quality. In its most effective form, data quality goes hand in hand with its corresponding metadata. One aspect of data quality (DQ) involves the profiling, measurement and assessment of the data to fulfill its intended purpose in business operations. DQ results depicted across multiple dimensions of DQ are most useful when business users can readily understand the rules that have been applied against a given term, illustrated alongside its metadata. It is important to note that business users and data scientists have differing interpretations of what DQ profiling and measurement represents (i.e., DQ means different things to different people depending entirely on one’s perspective).
For business users, DQ profiling and measurement is most relevant when applied at, or as close as possible to, the point of data origination or creation. While many resort to measuring DQ at the point of consumption (due to a variety of constraints), this is a fundamentally flawed approach, as there are often conflicting end-user requirements and interpretations based on planned data usage, and the resulting remediations are often developed in a vacuum by numerous people in a one-off manner. A DQ perspective at the point of consumption should be focused solely on fitness for purpose, based on what is measured at the point of data origination or creation.
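As a simple illustration of profiling and measuring DQ at (or near) the point of origination, the sketch below applies a few common DQ dimensions (completeness, validity, uniqueness) to a hypothetical source extract; the records, rules and thresholds are assumptions for illustration only.

```python
import datetime

# Hypothetical source records as captured at the point of origination.
records = [
    {"customer_id": "C001", "birth_date": "1980-04-12", "email": "a@example.com"},
    {"customer_id": "C002", "birth_date": None,         "email": "not-an-email"},
    {"customer_id": "C001", "birth_date": "1975-11-30", "email": "b@example.com"},
]

def completeness(rows, field):
    """Share of rows where the field is populated."""
    return sum(1 for r in rows if r.get(field) not in (None, "")) / len(rows)

def validity_date(rows, field, fmt="%Y-%m-%d"):
    """Share of populated values that parse as a valid date."""
    populated = [r[field] for r in rows if r.get(field)]
    ok = 0
    for value in populated:
        try:
            datetime.datetime.strptime(value, fmt)
            ok += 1
        except ValueError:
            pass
    return ok / len(populated) if populated else 0.0

def uniqueness(rows, field):
    """Share of distinct values relative to total rows."""
    values = [r[field] for r in rows]
    return len(set(values)) / len(values)

# Results across a few DQ dimensions, reported alongside the term's metadata.
print("birth_date completeness:", completeness(records, "birth_date"))   # ~0.67
print("birth_date validity:   ", validity_date(records, "birth_date"))   # 1.0
print("customer_id uniqueness:", uniqueness(records, "customer_id"))     # ~0.67
```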
The fourth Process component is master and reference data management. Master and Reference Data Management (MDM) is the control process by which master data is created and maintained as a system of record for the enterprise. While many organizations have mastered customer, product, security, organizational data or some combination thereof, it remains an unfulfilled gap for many others. It can be preferable, or simply expedient, to emulate and enable aspects of master and reference data management as you go in order to avoid large-scale, invasive mega-projects.
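One way to picture the incremental, "as you go" approach is a naive survivorship rule that consolidates duplicate customer records from two systems into a single mastered record; the survivorship logic, systems and field names below are illustrative assumptions, not a recommended design.

```python
# Hypothetical records for the same real-world customer, from two systems.
crm_record  = {"name": "J. Smith",   "email": "j.smith@example.com", "phone": None,
               "updated": "2023-01-10"}
core_record = {"name": "Jane Smith", "email": None,                  "phone": "555-0100",
               "updated": "2023-03-02"}

def master(*records):
    """Naive survivorship: for each attribute, keep the most recently updated non-null value."""
    ordered = sorted(records, key=lambda r: r["updated"])   # oldest first
    golden = {}
    for rec in ordered:
        for key, value in rec.items():
            if key == "updated":
                continue
            if value is not None:
                golden[key] = value        # newer records overwrite older ones
    return golden

print(master(crm_record, core_record))
# {'name': 'Jane Smith', 'email': 'j.smith@example.com', 'phone': '555-0100'}
```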
The fifth Process component covers data classification, tokenization and role-based access control (RBAC). Classifying data is non-trivial and is often tied to an enterprise’s information security approach and policies. Many organizations claim to have meticulously classified their data, but this is a taxing and detailed undertaking that often stops at the level of operational systems. The goal is to classify each and every data element appropriately so that security measures are maintained and access is granted accordingly.
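A minimal sketch of element-level classification feeding role-based access control might look like the following; the classification labels, roles and entitlements are illustrative assumptions.

```python
# Hypothetical classification of individual data elements.
CLASSIFICATION = {
    "customer.birth_date": "PII",
    "customer.email":      "PII",
    "account.balance":     "Confidential",
    "branch.city":         "Internal",
}

# Hypothetical role entitlements: which classification levels each role may read.
ROLE_ENTITLEMENTS = {
    "fraud_analyst":     {"PII", "Confidential", "Internal"},
    "marketing_analyst": {"Internal"},
}

def can_access(role: str, element: str) -> bool:
    """RBAC check: a role may read an element only if entitled to its classification level."""
    level = CLASSIFICATION.get(element, "Restricted")   # unclassified data defaults to most restrictive
    return level in ROLE_ENTITLEMENTS.get(role, set())

print(can_access("marketing_analyst", "customer.email"))  # False
print(can_access("fraud_analyst", "customer.email"))       # True
```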
Dr. Ann Cavoukian, former Information and Privacy Commissioner of Ontario, advances the view that the future of privacy cannot be assured solely by compliance with regulatory frameworks; rather, privacy assurance must ideally become an organization’s default mode of operation. Tokenization enables the realization of Cavoukian’s “privacy by design”. The execution of a privacy-by-design model subjects Personally Identifiable Information (PII) to ‘persistent’ tokenization (a reversible masking technique with encryption) before any individual is given access to the data. Pre-approved and managed de-tokenization is enabled as an exception rather than the rule.
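The sketch below illustrates reversible ("persistent") tokenization of a PII value using the Fernet recipe from the widely available Python cryptography package; in practice the key would live in a managed key vault and de-tokenization would be an audited exception, as described above. The library choice, key handling and sample value are assumptions for illustration.

```python
from cryptography.fernet import Fernet

# In a real program the key sits in a managed key vault; generating it inline is for illustration only.
key = Fernet.generate_key()
vault = Fernet(key)

def tokenize(pii_value: str) -> bytes:
    """Replace a PII value with a reversible, encrypted token before granting access."""
    return vault.encrypt(pii_value.encode("utf-8"))

def detokenize(token: bytes) -> str:
    """Pre-approved exception path: recover the original value from its token."""
    return vault.decrypt(token).decode("utf-8")

sin = "046-454-286"          # hypothetical PII value
token = tokenize(sin)
print(token)                 # what consumers see by default
print(detokenize(token))     # original value, only via the managed exception process
```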
The final Process component is data issue management and remediation. It is necessary to define a formalized process for identifying, triaging and managing data issues, with remediation at the source as the preferred approach. Governance councils must mature their focus from oversight to prioritizing the treatment of data issues and regularly monitoring the progress of their resolution. Rosen recommends prioritizing data issues on a scale of high, medium and low impact to increase visibility across all impacted data consumers.
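A minimal sketch of triaging logged data issues onto a high/medium/low impact scale and ordering them for governance review might look like this; the triage rule, thresholds and fields are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class DataIssue:
    """A logged data issue awaiting triage; field choices are illustrative assumptions."""
    id: str
    description: str
    affected_cde: str
    impacted_consumers: int
    impact: str = "low"       # high / medium / low, assigned at triage

def triage(issue: DataIssue) -> DataIssue:
    """Naive triage rule: impact scales with the number of impacted data consumers."""
    if issue.impacted_consumers >= 10:
        issue.impact = "high"
    elif issue.impacted_consumers >= 3:
        issue.impact = "medium"
    else:
        issue.impact = "low"
    return issue

backlog = [
    triage(DataIssue("DI-101", "Null birth dates from onboarding feed", "Customer Date of Birth", 12)),
    triage(DataIssue("DI-102", "Branch city misspellings", "Branch Location", 2)),
]

# Governance councils review the backlog ordered by impact, highest first.
order = {"high": 0, "medium": 1, "low": 2}
for issue in sorted(backlog, key=lambda i: order[i.impact]):
    print(issue.id, issue.impact, "-", issue.description)
```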
Building a Data Strategy
As described above, the Data Management Framework is the critical underpinning of the entire D&A program. While some organizations embed portions of the DMF within their D&A strategy, a D&A strategy is not the same as a DMF. A DMF is realized incrementally via a D&A strategy and roadmap that lay out the program to build out the target state. The D&A strategy describes the priorities, guiding principles, actions and roadmap (the order of projects/initiatives) you are pursuing to reach the target state. The DMF defines what you have accomplished when you have delivered the D&A strategy; it represents both the what and the how of your D&A program.
The D&A strategy needs to be built in concert with the framework. To help the business reach its target state and to ensure alignment across the organization, it is important for the D&A strategy to define the strategic themes of the organization’s D&A program.
A future post will discuss the method of applying a thematic approach to build out a comprehensive D&A strategy and roadmap for your organization in alignment with the business strategy.
About:
Cal Rosen, vice president of data and analytics at Home Trust in Toronto, Canada, began his data and analytics journey more than 25 years ago. Starting with defining and building data warehouses and prototypes for the telco industry in the US and Canada with Teradata Industry Consulting, he has since led data and analytics consulting practices for PwC and Cap Gemini Ernst & Young, in addition to starting and successfully running his own consulting business, ActionInfo Consulting. Rosen has successfully delivered projects in multiple industries across North America (e.g., communications, retail, financial and insurance, energy and utilities, transportation, healthcare, natural resources, gaming and the public sector). Recently, Rosen has held executive leadership roles in the nascent Data Offices of two of Canada’s largest international banks. As a data and analytics thought leader, passionate evangelist, and author, Rosen is a sought-after speaker and panelist, having delivered sessions at well over 30 data and analytics conferences.