Canonical Data Model Data Dictionary Mapping
Hanabal Khaing
Senior Enterprise Data Modeler, Data Gov, CDO, CTO, Law & BRD to Multidimensional legal compliance real-time anti-fraud ERP Data Model, 60 SKILLSETS, $53,000,000+ annual, Bank/F500/Gov/AI Data fix $4 BILLION Min Deposit
The purpose of a canonical model is to provide interoperability between multiple systems at the data level. Once the model is created, it needs a mapping file or data dictionary covering those systems, which shows how columns in one database map to columns in another database. An ETL process then uses the mapping to extract data from one system, transform it, and load it into the canonical model. At that point, another system can extract the data in a different format based on its own requirements. Additionally, BI tools can be connected to the canonical model or to its data marts for real-time analysis.
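As a rough illustration of the idea above, the following Python sketch uses a small set of mapping records to move one row from a source system into a canonical shape and then out to a second system. The system names (`crm`, `billing`), the column names, and the helper functions `to_canonical` and `from_canonical` are all hypothetical and chosen only for this example. A real ETL process would also handle data types, keys, and validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMapping:
    """One mapping record: a source column and its canonical column."""
    source_system: str
    source_column: str
    canonical_column: str


# Hypothetical mapping records for two made-up systems.
MAPPINGS = [
    ColumnMapping("crm", "cust_id", "customer_id"),
    ColumnMapping("crm", "cust_nm", "customer_name"),
    ColumnMapping("billing", "AccountNo", "customer_id"),
    ColumnMapping("billing", "CustomerName", "customer_name"),
]


def to_canonical(system, row, mappings=MAPPINGS):
    """Rename a source row's columns to canonical names; drop unmapped columns."""
    lookup = {m.source_column: m.canonical_column
              for m in mappings if m.source_system == system}
    return {lookup[col]: value for col, value in row.items() if col in lookup}


def from_canonical(system, canonical_row, mappings=MAPPINGS):
    """Rename canonical columns to the names a target system expects."""
    lookup = {m.canonical_column: m.source_column
              for m in mappings if m.source_system == system}
    return {lookup[col]: value
            for col, value in canonical_row.items() if col in lookup}


# Extract a row from the hypothetical CRM, pass it through the canonical
# model, and hand it to the hypothetical billing system in its own format.
crm_row = {"cust_id": 42, "cust_nm": "Acme Ltd", "internal_flag": "x"}
canonical = to_canonical("crm", crm_row)
billing_row = from_canonical("billing", canonical)
```

Because every system maps only to the canonical model, adding a new system in this sketch means adding one set of mapping records rather than one set per existing system.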
The data dictionary is a set of multi-dimensional tables that capture metadata for all data models and databases that must use the canonical model. It can also capture how the columns are mapped between databases. Many ETL tools, such as Informatica and IBM DataStage, use XML instead of a multi-dimensional star schema to capture the mapping data, but the final effect is the same. Once the metadata is captured, multiple mapping files or mapping records can be created to provide automated ETL through the canonical model. Additionally, data can be moved into and out of the canonical model using JSON, SQL, stored procedures, Java, and other languages and tools. However, each ETL method must have its own set of unit tests to ensure data integrity.
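One possible way to hold such a data dictionary in tables is sketched below with Python's built-in `sqlite3` module. The layout is an assumption made for illustration, not a prescribed design: a hypothetical `dim_column` dimension stores column metadata for every system, and a hypothetical `fact_column_mapping` table links each source column to its canonical column. The helpers `load_mapping` and `unmapped_columns` are also invented here; the second one shows the kind of integrity check a unit test could assert on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_column (              -- metadata for every known column
    column_id   INTEGER PRIMARY KEY,
    system_name TEXT NOT NULL,
    table_name  TEXT NOT NULL,
    column_name TEXT NOT NULL,
    data_type   TEXT NOT NULL
);
CREATE TABLE fact_column_mapping (     -- one row per source-to-canonical link
    source_column_id    INTEGER NOT NULL REFERENCES dim_column(column_id),
    canonical_column_id INTEGER NOT NULL REFERENCES dim_column(column_id),
    PRIMARY KEY (source_column_id, canonical_column_id)
);
""")

# Hypothetical metadata: a canonical model and one source system ("crm").
conn.executemany("INSERT INTO dim_column VALUES (?, ?, ?, ?, ?)", [
    (10, "canonical", "customer", "customer_id", "INTEGER"),
    (11, "canonical", "customer", "customer_name", "TEXT"),
    (20, "crm", "cust", "cust_id", "INTEGER"),
    (21, "crm", "cust", "cust_nm", "TEXT"),
])
conn.executemany("INSERT INTO fact_column_mapping VALUES (?, ?)",
                 [(20, 10), (21, 11)])


def load_mapping(conn, system_name):
    """Return {source_column: canonical_column} for one source system."""
    rows = conn.execute("""
        SELECT src.column_name, can.column_name
        FROM fact_column_mapping AS m
        JOIN dim_column AS src ON src.column_id = m.source_column_id
        JOIN dim_column AS can ON can.column_id = m.canonical_column_id
        WHERE src.system_name = ?
    """, (system_name,)).fetchall()
    return dict(rows)


def unmapped_columns(conn, system_name):
    """List source columns with no canonical mapping (an integrity check)."""
    rows = conn.execute("""
        SELECT c.column_name
        FROM dim_column AS c
        LEFT JOIN fact_column_mapping AS m ON m.source_column_id = c.column_id
        WHERE c.system_name = ? AND m.source_column_id IS NULL
        ORDER BY c.column_name
    """, (system_name,)).fetchall()
    return [name for (name,) in rows]


crm_mapping = load_mapping(conn, "crm")
crm_gaps = unmapped_columns(conn, "crm")
```

In this sketch, a unit test for the hypothetical CRM load could assert that `crm_gaps` is empty before any ETL job runs, so a newly added source column without a mapping is caught early.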