The Strategic Value of Data Lineage in Identifying Data Risks

First published on my website (The Strategic Value of Data Lineage in Identifying Your Data Risks — AHUJA CONSULTING LIMITED)

Beyond Technical Tracking: A Comprehensive Approach

For many, data lineage is merely an out-of-the-box capability in ETL tools that shows where data originates and how it is transformed.

However, its potential extends far beyond this basic functionality. A detailed lineage, incorporating the technical flows and the principal processes concerned with creating, manipulating and augmenting your data, is a rich artefact that will enable you to:

1. Proactively identify key data quality risks

2. Assess and calibrate data control frameworks

3. Facilitate interactions with auditors and regulators

A Cautionary Tale: The London Whale

A stark example of the risks associated with inadequate lineage can be seen in the now infamous 2012 London Whale case.

Due to an error in an Excel spreadsheet used to model risk, one financial institution seriously underestimated the downside of its synthetic credit portfolio, resulting in $6 billion in losses.

Had they documented a comprehensive business lineage that mapped the relevant operations and enabled an assessment of the risk of error?

One assumes not.

An Example of Comprehensive Lineage

Let’s take a look at what a robust lineage looks like. See the table below for a sample partial flow from the insurance industry, culminating in the claim registration process:


You can see how breaking down the key steps enables the risks to be identified which, in turn, allows for an assessment of the control environment in place.

Key Components of a Robust Data Lineage

To build a comprehensive data lineage, you need to gather four critical pieces of information:

1. Point of origination of the data being tracked

2. Responsible business processes

3. Existing control environments

4. Technical data flow

#1 Data Origin

Understanding the golden source of data is fundamental. In our insurance example, risk registration data originates from an Underwriter Front Sheet, compiled from a broker's slip.

You need to understand the source to enable an assessment of its validity.

In this example, it’s evident that we have an immediate risk of the Front Sheet being miscoded.

#2 Business Process Understanding

Data is primarily created within business processes before being used and transformed downstream. Therefore, a thorough understanding of these processes and how they interrelate is vital.

Again, you can see this in our example above. The breakdown is focused on the processes responsible for the data, not just a technical flow.

#3 Controls

Situated within each business process should be a series of controls. When mapping business processes, it’s crucial to document:

· Where each control sits within the process

· The scope of each control

· Responsible parties
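
As a minimal sketch of how this documentation might be captured, the Python below records each process step with its risks and controls, then lists the risks that no control covers. The class names, field names, and sample steps are illustrative assumptions loosely based on the insurance example, not a prescribed schema.

```python
from dataclasses import dataclass, field


@dataclass
class Control:
    name: str
    scope: str   # the risk this control is intended to cover
    owner: str   # responsible party


@dataclass
class ProcessStep:
    name: str
    risks: list[str] = field(default_factory=list)
    controls: list[Control] = field(default_factory=list)

    def uncontrolled_risks(self) -> list[str]:
        """Return the risks on this step that no control's scope covers."""
        covered = {control.scope for control in self.controls}
        return [risk for risk in self.risks if risk not in covered]


# Hypothetical steps, risks, and controls for illustration only.
steps = [
    ProcessStep(
        name="Compile Underwriter Front Sheet",
        risks=["Front Sheet miscoded"],
        controls=[
            Control("Peer review of Front Sheet", "Front Sheet miscoded", "Underwriting")
        ],
    ),
    ProcessStep(
        name="Register risk in policy system",
        risks=["Keying error"],
    ),
]

# Steps that carry at least one risk with no documented control.
gaps = {s.name: s.uncontrolled_risks() for s in steps if s.uncontrolled_risks()}
print(gaps)
```

Matching a control to a risk by exact text is a simplification; in practice you would probably use risk identifiers from your own risk register.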

#4 Technical Lineage

You’ll need to understand how the data flows across systems (horizontal lineage), as well as whether any transformations occur along the way (vertical lineage).

Tracking both horizontal (system-to-system) and vertical (transformational) lineage helps identify critical data touchpoints and potential loss and corruption risks.
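
One way to picture this is as a small graph in which every edge is tagged as horizontal or vertical, following the definitions above. The sketch below walks upstream from a chosen dataset to collect every touchpoint that feeds it. The system names and the edge list are hypothetical, and a real lineage tool would hold far richer metadata.

```python
from collections import deque

# Each edge is (source, target, kind). All names are hypothetical.
EDGES = [
    ("front_sheet", "policy_system", "horizontal"),          # system-to-system move
    ("policy_system", "policy_system_coded", "vertical"),    # transformation
    ("policy_system_coded", "claims_system", "horizontal"),  # system-to-system move
]


def upstream(node, edges):
    """Return every edge lying on a path that leads into `node`."""
    incoming = {}
    for src, dst, kind in edges:
        incoming.setdefault(dst, []).append((src, dst, kind))

    seen = {node}
    found = []
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for edge in incoming.get(current, []):
            found.append(edge)
            if edge[0] not in seen:
                seen.add(edge[0])
                queue.append(edge[0])
    return found


touchpoints = upstream("claims_system", EDGES)
# Vertical edges are where data is transformed and could be corrupted.
transformations = [edge for edge in touchpoints if edge[2] == "vertical"]
print(transformations)
```

Separating the two edge types makes it easy to ask different questions of the same map: horizontal edges point to where data could be lost in transit, vertical edges to where it could be altered.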

Implementation Considerations

The amount of time it takes to build a detailed lineage for even one flow should not be underestimated.

Whilst well worth it, it’s going to require investment.

Not all processes are well documented.

Neither are all controls.

Your IT department may have poor documentation standards.

All of this will slow you down.

But don’t let these factors stop you. If anything, they should spur you on.

Contrary to what we’re often told, when it comes to our data flows, it’s what we don’t know that will hurt us!

Strategic Recommendation

Despite challenges, developing a comprehensive data lineage is crucial.

Start by mapping high-level process flows, identifying key handoff points, and gradually building complexity.
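
As a rough illustration of that first pass, the sketch below takes an ordered high-level flow and flags each point where ownership changes from one team to another. The step names and owning teams are assumptions made up for the example, and treating a change of owner as a handoff is only one possible definition.

```python
# Ordered high-level flow as (step, owning team). All values are hypothetical.
FLOW = [
    ("Receive broker slip", "Underwriting"),
    ("Compile Front Sheet", "Underwriting"),
    ("Register risk", "Operations"),
    ("Register claim", "Claims"),
]


def handoffs(flow):
    """Return (from_step, to_step, from_owner, to_owner) wherever the owner changes."""
    return [
        (prev_step, next_step, prev_owner, next_owner)
        for (prev_step, prev_owner), (next_step, next_owner) in zip(flow, flow[1:])
        if prev_owner != next_owner
    ]


for from_step, to_step, from_owner, to_owner in handoffs(FLOW):
    print(f"{from_step} -> {to_step}: {from_owner} hands off to {to_owner}")
```

Each flagged handoff is a natural place to look first for risks and controls before adding finer-grained detail to the map.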

The resulting artefact will become an invaluable tool for managing data quality risks and enhancing organisational control frameworks.

In the next article, which will be published in the New Year, we’ll focus on how to build a comprehensive data control framework.

LOUIS HAUSLE

Sales Director - Launching MetaKarta - Data Catalog|Data Governance|Data Lineage

1 month ago

Great insights, Navin! Understanding data lineage seems crucial for better risk assessment. What would you say is the biggest challenge organizations face when mapping out critical data flows?
