Metadata-Driven (MDD) Framework Meets Data Warehouse Automation

Data is often described as the fastest-growing currency of today's insight-driven world, with a value that rivals oil and gold. To maximize its utility and drive business decisions, most organizations place a data warehouse at the centre of their analytics strategy. When you need to make game-changing decisions in a competitive landscape, the speed and accuracy of information are of the essence.

Because data warehouses have many moving parts, and because getting the platform ready usually involves a long wait, it makes sense to put automated processes in place that let IT teams deliver actionable insights at the speed of business. This is where configuration-driven automation comes in. Modern tech stacks encourage this shift toward an automation mindset. In a digital transformation, a faster data-to-value journey is imperative: it leaves more room for innovation and makes work more purpose-oriented and enjoyable for your IT team.

After data modeling, perhaps the most time-consuming part is writing the ETL/ELT code that populates your data warehouse. Automation introduces developers to a zero-code or low-code (configurable) approach, in which they work at a logical (design) level to create the integration flows. IT teams no longer have to wrestle with hand-written SQL, and can move data from source systems to the destination warehouse in hours or days to deliver value to the business.

Metadata in digital systems is abundant and pervasive. The role of records and information management is to identify which metadata in business applications, systems and cloud environments is necessary for the creation, capture and management of authoritative records and information used for operational as well as orchestration activities. We implemented the metadata-driven architecture (operational and technical metadata) for domains such as Banking, Security, Retail, Logistics and Life Sciences, and eventually brought it to Insurance.

In a data warehouse, metadata can be many things: data types, data formats, source and destination database tables, entity relationships, SCD (slowly changing dimension) patterns, ETL mappings and transformations, and more. A metadata-driven architecture therefore allows you to bring a source database schema into a data model, customize its structure based on your business requirements, and make the data model available for subsequent processes, such as data analytics. Coupled with automation, the metadata-driven approach streamlines design, development, and deployment, leading to a robust data warehouse implementation. The combination gives IT teams what they need to build agile and sustainable processes that deliver high-quality outputs consistently. It also comes in handy when dealing with data drift.
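To make the idea of driving ETL from metadata concrete, here is a minimal sketch in Python. It assumes the mapping metadata is held as simple rows (source column, target column, optional SQL transform) and renders a load statement from them. The table names, column names and the `build_load_sql` helper are all hypothetical, chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMapping:
    """One metadata row: how a source column feeds a target column."""
    source_column: str
    target_column: str
    transform: str = "{col}"  # SQL expression template; {col} is the source column


def build_load_sql(source_table: str, target_table: str,
                   mappings: list[ColumnMapping]) -> str:
    """Render an INSERT ... SELECT statement purely from mapping metadata."""
    target_cols = ", ".join(m.target_column for m in mappings)
    select_exprs = ", ".join(
        m.transform.format(col=m.source_column) for m in mappings
    )
    return (
        f"INSERT INTO {target_table} ({target_cols}) "
        f"SELECT {select_exprs} FROM {source_table}"
    )


# Hypothetical metadata for a customer dimension load.
customer_mappings = [
    ColumnMapping("cust_id", "customer_key"),
    ColumnMapping("cust_nm", "customer_name", "TRIM({col})"),
    ColumnMapping("dob", "birth_date", "CAST({col} AS DATE)"),
]

load_sql = build_load_sql("stg.customer", "dw.dim_customer", customer_mappings)
print(load_sql)
```

With this shape, adding a column or changing a transformation means editing a metadata row, not rewriting the SQL by hand.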

To be authoritative, metadata should capture:

  • a description of the content of records and information
  • the structure of records and information
  • the business context in which records and information were created or received and used
  • relationships with other records, information and metadata
  • business actions and events
  • information that may be needed to retrieve and present records and information

The effective implementation of metadata rests on the following principles, which we learnt while developing configurable designs and architectures for data management platforms, on-premise or in the cloud:

  • Metadata requirements should be considered as part of the process: identify what metadata is necessary for the creation, capture, management and orchestration of data management jobs, and what metadata supports organisational auditing and business lineage.
  • Metadata is scalable: determine which levels of metadata best meet the organisation's various business needs, and to what depth.
  • Metadata should be described, stored and managed: use metadata schemas and encoding schemes to promote the entry of meaningful, standardised and consistent metadata.
  • Metadata is dynamic and grows over time: be aware that records and information will continue to accrue metadata throughout their existence.
  • Metadata should be persistently linked with records and information, so that reports can be generated on top of it: ensure that metadata stays linked with the records and information to which it relates when they are transferred out of their original creating environment and through subsequent migrations.
  • Metadata should be managed as a record: document how the organisation has configured and applied metadata in its systems.

The ideas in this blog fall into two categories. Principles are those concepts judged to be common to all domains of metadata, which might inform the design of any metadata schema or application. Practicalities are the rules of thumb, constraints, and infrastructure issues that emerge from bringing theory into practice in the form of useful and sustainable systems.

Principles:

A. Modularity : Metadata modularity is a key organizing principle for environments characterized by vastly diverse sources of content, styles of content management, and approaches to resource description. It allows designers of metadata schemas to create new assemblies based on established metadata schemas and benefit from observed best practice, rather than reinventing elements anew.

B. Extensibility : Metadata systems must allow for extensions so that particular needs of a given application can be accommodated. Some metadata elements are likely to be found in most metadata schemas (the concept of creator or identifier of an information resource, for example). Others will be specific to particular applications or domains (degree of cloud cover, for example, in remote sensing data).

C. Refinement : Application domains will differ according to the degree of detail that is necessary or desirable. The design of metadata standards should allow schema designers to choose a level of detail appropriate to a given application. Populating databases with metadata is costly, so there are strong economic incentives to create metadata with sufficient detail to meet the functional requirements of an application, but not more.
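As a rough illustration of modularity and extensibility, the sketch below composes a schema from a shared core element set plus a domain-specific extension, then checks a record against it. The element names (`identifier`, `creator`, `cloud_cover_pct`) follow the examples in the text, while the helper functions and the representation of a schema as a name-to-type dictionary are assumptions made for this example.

```python
# Shared core element set, reusable by any schema (modularity).
CORE_ELEMENTS = {"identifier": str, "creator": str}

# Domain-specific extension for remote sensing data (extensibility).
REMOTE_SENSING_ELEMENTS = {"cloud_cover_pct": float}


def compose_schema(*element_sets: dict[str, type]) -> dict[str, type]:
    """Merge element sets into one schema, rejecting conflicting definitions."""
    schema: dict[str, type] = {}
    for elements in element_sets:
        for name, expected_type in elements.items():
            if name in schema and schema[name] is not expected_type:
                raise ValueError(f"conflicting definition for element {name!r}")
            schema[name] = expected_type
    return schema


def type_errors(record: dict, schema: dict[str, type]) -> list[str]:
    """Return the names of elements that are unknown or have the wrong type."""
    return [
        name for name, value in record.items()
        if name not in schema or not isinstance(value, schema[name])
    ]


imagery_schema = compose_schema(CORE_ELEMENTS, REMOTE_SENSING_ELEMENTS)
record = {"identifier": "scene-001", "creator": "ingest-job", "cloud_cover_pct": 12.5}
print(type_errors(record, imagery_schema))
```

Refinement fits the same shape: a schema designer who needs less detail simply composes fewer element sets.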

Practicalities:

A. Application Profiles: No single metadata element set will accommodate the functional requirements of all applications, and it is increasingly important to be able to cross discovery boundaries as well. Application profiles facilitate this by allowing designers to 'mix and match' schemas as appropriate. They keep this modularity manageable through cardinality enforcement. Cardinality refers to constraints on the appearance of an element: is it optional, mandatory, or conditional?
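Cardinality enforcement can be sketched as a small validation step. In the example below, the profile contents and the rule that a conditional element becomes required when another element is present are assumptions for illustration; a real application profile may define conditions differently.

```python
from enum import Enum


class Cardinality(Enum):
    MANDATORY = "mandatory"
    OPTIONAL = "optional"
    CONDITIONAL = "conditional"


# Hypothetical profile: element name -> (cardinality, element it depends on).
# A conditional element becomes required when the element it depends on is present.
PROFILE = {
    "identifier": (Cardinality.MANDATORY, None),
    "creator": (Cardinality.OPTIONAL, None),
    "license": (Cardinality.OPTIONAL, None),
    "license_url": (Cardinality.CONDITIONAL, "license"),
}


def check_cardinality(record: dict, profile: dict) -> list[str]:
    """Return a list of human-readable cardinality violations."""
    problems = []
    for name, (cardinality, depends_on) in profile.items():
        present = name in record
        if cardinality is Cardinality.MANDATORY and not present:
            problems.append(f"{name} is mandatory")
        elif (cardinality is Cardinality.CONDITIONAL
              and depends_on in record and not present):
            problems.append(f"{name} is required when {depends_on} is present")
    return problems


print(check_cardinality({"identifier": "doc-1", "license": "CC-BY"}, PROFILE))
```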

B. Syntax and Semantics: Semantics is about meaning; syntax is about form. Agreements about both are necessary for two development communities, departments or lines of business (LoBs) to share metadata. Two communities may agree about the meaning of the term title or creator or identifier, but until they have a shared convention for identifying and encoding values, they cannot easily exchange their metadata. Such a convention helps standardise data across the organization and avoids many integration challenges.
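The gap between semantic and syntactic agreement can be shown with a small sketch. Here two hypothetical departments mean the same thing by creator and creation date, but use different field names and date encodings. The department names, field names and formats are invented for the example.

```python
from datetime import datetime

# Hypothetical per-department conventions: local field name -> shared term,
# plus the date encoding each department uses.
CONVENTIONS = {
    "claims": {"fields": {"author": "creator", "created": "date_created"},
               "date_format": "%d/%m/%Y"},
    "underwriting": {"fields": {"created_by": "creator", "creation_dt": "date_created"},
                     "date_format": "%Y%m%d"},
}


def to_shared(record: dict[str, str], department: str) -> dict[str, object]:
    """Translate a department record to the shared vocabulary and encoding."""
    convention = CONVENTIONS[department]
    shared: dict[str, object] = {}
    for local_name, value in record.items():
        term = convention["fields"][local_name]
        if term == "date_created":
            # Parse the department's own encoding into a common date type.
            value = datetime.strptime(value, convention["date_format"]).date()
        shared[term] = value
    return shared


a = to_shared({"author": "asha", "created": "05/03/2021"}, "claims")
b = to_shared({"created_by": "asha", "creation_dt": "20210305"}, "underwriting")
print(a == b)
```

Once both records are expressed in the shared terms and a single date type, they can be compared or merged directly.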

Kudos to all the data & quality management team members - Swati, Nainish, Joydeep, Thanga, Aarti, Dhiraj, Tanya, Yogita, Sadhana, Aaditya, Vaishali, Lalchand, Harshal, Kalai, Madhuri, Madhura, Pooja, Shubhada, Vini & all. Thank you for making it a memorable journey.

Conclusion:

The Saama MDD (metadata-driven) framework simplifies and automates data warehouse development end-to-end, using an agile metadata-driven approach. The product fetches metadata directly from source databases and allows you to use it in the design, development, and deployment phases of your data warehouse. Once it is implemented, introducing design changes is easy, because the captured metadata lets you propagate changes across the board while preserving the integrity of existing models, integration flows, and deployments.
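As a generic illustration of how captured metadata can help with change propagation (this is not the framework's actual implementation), the sketch below compares a previously captured schema snapshot with the current source schema and classifies the differences. The column names and types are hypothetical.

```python
def diff_schema(captured: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {column: data_type} snapshots and classify the differences."""
    return {
        "added": sorted(current.keys() - captured.keys()),
        "removed": sorted(captured.keys() - current.keys()),
        "retyped": sorted(
            col for col in captured.keys() & current.keys()
            if captured[col] != current[col]
        ),
    }


# Hypothetical snapshots of a source table, before and after a change.
captured = {"policy_id": "INTEGER", "premium": "DECIMAL(10,2)", "agent": "VARCHAR(50)"}
current = {"policy_id": "BIGINT", "premium": "DECIMAL(10,2)", "region": "VARCHAR(20)"}

changes = diff_schema(captured, current)
print(changes)
```

A downstream step could use such a diff to decide which models and integration flows need to be regenerated.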

Want to see the power of the metadata-driven approach and how these two technologies work together in action? Reach out at [email protected] for more details.

More articles by Koustubh Dhopade
