Using Data Catalog And Data Lineage As a Saviour
SUBHADEEP PAL
Product Manager | FinTech, EdTech | Design Thinking | Agile Practitioner | Scrum Master | Member @ Toastmasters International
As a Product Manager, your role will be aligned to a particular Product Line (PL) and Sub Product Line (SPL) of your organisation. For me it is Data Analytics and Business Intelligence. Based on my experience so far, in this article I will talk about a couple of tools that will make your job easier if you work with data regularly.
With the exponential increase in the volume and variety of data coming from inside and outside the organisation, data landscapes are becoming more complex by the day. Also, as many organisations move to cloud-based infrastructure, more applications are deployed as services, and the data is becoming more fragmented. And as businesses become data driven, most business users are asking for an easy way to consume data for their business needs. So, in order to turn this data into valuable assets and bring value to the operational and analytical systems of the organisation as well as to the consumers, we need a way to categorize and classify all the data automatically at scale. In addition, it is necessary to be aware of the life cycle of the data and the transformations it undergoes from the source to the database. That is when the concepts of Data Catalog and Data Lineage come into the picture, as they help you track data easily in terms of schemas, views and tables.
So before we take a deep dive into the usefulness of data catalog and data lineage, I will take a minute to discuss what they actually are at a high level. A Data Catalog is a collection of metadata, combined with data management and search tools, that helps data users find the data they need, serves as an inventory of available data, and provides information to evaluate the data for its intended uses. Data Lineage, on the other hand, presents the genesis of a dataset and how it adapts and evolves on its journey. It describes a dataset's origin, movement, characteristics and quality.
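To make the two definitions a little more concrete, here is a minimal Python sketch of my own. It is not how any particular catalog product works. The class names (`CatalogEntry`, `DataCatalog`), the metadata fields and the sample datasets are all hypothetical and chosen only for illustration. The catalog part is a searchable inventory of metadata, and the lineage part is a simple record of which datasets each dataset is derived from.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata describing one dataset (field names are illustrative assumptions)."""
    name: str
    description: str
    owner: str
    tags: set[str] = field(default_factory=set)


class DataCatalog:
    """A toy in-memory catalog with keyword search and upstream lineage."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}
        # Lineage edges: dataset name -> names of the datasets it is derived from.
        self._parents: dict[str, set[str]] = {}

    def register(self, entry: CatalogEntry, derived_from: tuple[str, ...] = ()) -> None:
        """Add a dataset to the inventory and record its direct sources."""
        self._entries[entry.name] = entry
        self._parents[entry.name] = set(derived_from)

    def search(self, keyword: str) -> list[str]:
        """Return names of datasets whose name, description or tags match the keyword."""
        kw = keyword.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if kw in e.name.lower()
            or kw in e.description.lower()
            or kw in {t.lower() for t in e.tags}
        )

    def upstream(self, name: str) -> set[str]:
        """Return every dataset that `name` depends on, directly or indirectly."""
        seen: set[str] = set()
        stack = list(self._parents.get(name, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self._parents.get(current, ()))
        return seen


# Hypothetical datasets, purely for demonstration.
catalog = DataCatalog()
catalog.register(
    CatalogEntry("raw_payments", "Payment events from the gateway", "payments-team", {"finance"})
)
catalog.register(
    CatalogEntry("clean_payments", "Deduplicated payments", "data-eng", {"finance"}),
    derived_from=("raw_payments",),
)
catalog.register(
    CatalogEntry("revenue_report", "Monthly revenue by product", "bi-team", {"finance", "report"}),
    derived_from=("clean_payments",),
)

print(catalog.search("revenue"))           # discovery: which datasets mention revenue?
print(catalog.upstream("revenue_report"))  # lineage: where does the report come from?
```

In this sketch the search answers the catalog question ("what data do we have and who owns it?"), while the upstream walk answers the lineage question ("which sources feed this report?"). A real tool would also harvest the metadata automatically and track the transformations applied at each step.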
Data Catalog and Data Lineage are helpful to an organisation in the following ways:
I hope that after going through this short article you will have some key takeaways.
Please do not forget to show your love by giving a like.
I would also like to have your valuable feedback through comments.
Regards
Subhadeep Pal
Principal AI Product Manager | PSPO | Gen AI | LLMs | Prompt Engineering | Data Analytics | Cloud (SaaS, CPaaS, PaaS) | Growth Hacks | 0-1 & scaling 1-N | Intrapreneur | Mentor
3 years ago
Focus a little more on data catalog ontologies. They are data frameworks for representing shareable and reusable knowledge across products and domains. It will help you build effective models for your sub product line.