Episode 6: Part 2 - Documenting Data Lineage
Oyinlola Oresanya
Senior Data Governance Consultant @ Devoteam | CDMP, TOGAF, PMP, CBAP, Google Cloud Digital Leader
The workshop took longer than expected and we had to continue the following day.
Chike introduced me to the Monthly Operational Performance Report, an important dashboard used by senior management to monitor key metrics like transaction volume, branch efficiency, and customer satisfaction.
“Our job,” Chike said, “is to document the lineage of this report. We need to show how the data flows from its source systems, through processing steps, to its final presentation in the dashboard.”
Step 1: Identifying Data Sources
We began by reviewing the data sources feeding into the report:
Each source was linked to metadata that described its structure, owner, and update frequency.
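To make that concrete, here is a minimal sketch of what one such metadata record could look like. The three attributes (structure, owner, update frequency) come from the description above; the class name, field names, and the example source are hypothetical and only there for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SourceMetadata:
    """Metadata attached to one data source (hypothetical shape)."""
    name: str
    structure: tuple[str, ...]   # field names exposed by the source
    owner: str                   # team or person accountable for the source
    update_frequency: str        # e.g. "daily", "hourly", "monthly"


def describe(source: SourceMetadata) -> str:
    """Build a one-line summary suitable for a catalog listing."""
    return (
        f"{source.name} (owner: {source.owner}, "
        f"refreshed {source.update_frequency}, "
        f"{len(source.structure)} fields)"
    )


# Hypothetical example source; the names and values are assumptions.
core_banking = SourceMetadata(
    name="core_banking_transactions",
    structure=("transaction_id", "branch_id", "amount", "posted_at"),
    owner="Retail Operations",
    update_frequency="daily",
)

print(describe(core_banking))
```

Keeping the record immutable (`frozen=True`) is one possible design choice: a change to a source's metadata then has to be made as an explicit new record rather than a silent edit.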
Step 2: Mapping Data Transformations
Next, we looked at how the data was processed before reaching the report:
Chike demonstrated how to use the data catalog tool to map these steps.
“Each transformation adds context to the lineage,” he explained. “If there’s an issue in the report, this map helps us trace it back to the source and fix it.” He added these details to the data catalog, noting each step in the process and linking it to the relevant metadata.
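The idea of tracing a report issue back to its source can be sketched as a walk over a small graph. This is not how any particular data catalog tool works; it is an illustration under the assumption that lineage is stored as "each node lists the nodes it reads from". Every node name below is hypothetical.

```python
from collections import deque

# Hypothetical lineage map: each node points to the nodes it reads from directly.
UPSTREAM = {
    "monthly_operational_performance_report": ["report_aggregates"],
    "report_aggregates": ["cleaned_transactions", "cleaned_survey_responses"],
    "cleaned_transactions": ["transaction_source_system"],
    "cleaned_survey_responses": ["survey_source_system"],
    "transaction_source_system": [],
    "survey_source_system": [],
}


def trace_to_sources(node, upstream=UPSTREAM):
    """Return the sorted names of all source systems reachable from `node`.

    A source system is any node with no upstream dependencies.
    """
    seen = {node}
    queue = deque([node])
    sources = set()
    while queue:
        current = queue.popleft()
        parents = upstream.get(current, [])
        if not parents:
            sources.add(current)
        for parent in parents:
            if parent not in seen:  # guard against revisiting shared nodes
                seen.add(parent)
                queue.append(parent)
    return sorted(sources)


print(trace_to_sources("monthly_operational_performance_report"))
```

Starting the walk from an intermediate step instead of the report narrows the search to the sources behind that one step, which is how a map like this helps isolate where a problem entered.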
Step 3: Documenting Metadata
With the lineage mapped, we documented metadata for each stage:
Descriptive Metadata
Provenance Metadata
Technical Metadata
Administrative Metadata
Step 4: Validating the Metadata
Once we’d mapped the lineage, Chike showed me how to validate the metadata:
“This is where metadata governance comes in,” Chike said. “Good metadata needs to be understandable and usable by everyone.”
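One simple check that could support this kind of validation is completeness: confirming that every lineage stage has an entry for each of the four metadata categories documented in Step 3. The sketch below assumes metadata is held as a plain dictionary per stage; the function name and the example stage are hypothetical, and a real review would cover more than completeness.

```python
# The four categories documented in Step 3.
REQUIRED_CATEGORIES = ("descriptive", "provenance", "technical", "administrative")


def missing_metadata(stage_metadata):
    """Return the categories that are absent or blank for one lineage stage."""
    return [
        category
        for category in REQUIRED_CATEGORIES
        if not str(stage_metadata.get(category, "")).strip()
    ]


# Hypothetical stage record: "technical" is blank and "administrative" is absent.
stage = {
    "descriptive": "Aggregated monthly figures per branch",
    "provenance": "Derived from cleaned transaction data",
    "technical": "   ",
}

print(missing_metadata(stage))
```

A check like this only shows that something was written down. Whether the entry is understandable to other readers still needs a human review.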
Step 5: Validating the Lineage
After completing the documentation, we reviewed the lineage with key stakeholders:
Their feedback helped refine the lineage and fill in missing details.
As we wrapped up, Chike asked me what I’d learned from the task.
“Data lineage is like telling a story,” I said. “It shows where the data started, what happened to it, and how it ended up in the report. Without this, we’d be working blind.”
Chike nodded. “Exactly. Metadata—and lineage in particular—is what makes our data governance work. It gives us traceability, accountability, and trust.”
That evening, I felt a sense of accomplishment. The task had shown me metadata’s power to bring clarity and order to the complex world of data governance.