What are the main building blocks of Data Warehouse models?
Drazen Orescanin
Enabler of digital privacy for global leaders and expert for privacy-enhancing technologies. Serial entrepreneur and founder at Legit (Data Privacy Manager), Poslovna Inteligencija and BIRD AI Incubator
In the previous and first blog in this series, we have tried to answer the question Why do we need Data Warehouse models? There was a lot of likes, shares and comments on that one and you can find the article here
Although I thought that in next blog in the series I will address benefits of implementation, I will focus on model architecture and continue the analogy that I have used in the first blog – how is building a house similar to building a successful Data Warehouse.
Long time ago (in 1998, to be precise) I went to SAS European User Group conference in Hague where Bill Inmon was a keynote speaker and that was the first time I have seen him presenting. He said at this lecture something very important that I have remembered: “You cannot build the house without foundations. I have never seen beautiful foundations - they are always a lot of mud, steel and concrete. But only on good foundations you can build good and beautiful house.”
The same logic applies for Data Warehouse – if you want to build good Data Warehouse, you need to have good foundations. In Data Warehouse context, foundations are the master data that business needs, that needs to be modeled right and cleansed. Master data include all the data about time periods, locations, product and service catalogs, involved parties and types of events and statuses.
After that, you will build the ground floor of your house where you spend most of your time doing daily business, or the base layer of your Data Warehouse. This is the detailed data that you will load from your source systems or external sources, integrate and cleanse to fit in your data model. This will include detailed data about your customers, their accounts and transactions, invoices and payments and financial data.
Finally, you will build your top floor where you will have your private area, and that is derived data layer in your Data Warehouse. In this area you are creating derived data marts or data sets that combine data from various parts of foundation and base layers, enabling you to do advanced analytics like profitability and risk calculations or predictive models about customer behavior, or calculate KPI's.
You can see sample organization of Data Warehouse model for a bank in following image:
When creating the model, you will use entities and relationships that will be represented in your Data Warehouse as tables and primary-foreign key relationships. You can think about entities as rooms where you can store your things, and relationships are like doors that are connecting those rooms. Every room has one or more doors that is connecting the room with other rooms and you will use those doors to move and communicate between rooms. Each room has its own functionality – in same ways entities are representing different business subjects.
Within the big house, rooms are used to support different functionalities, such as dining, living room, guest rooms or bedrooms. In a same way, entities and relationships are grouped in Subject Areas that are covering the requirements of specific business area – Sales, Inventory or Contact Center. In model image above, subject areas are represented as small boxes.
In next blog we will discuss more about model building blocks and modeling patterns that are used for model design. At some point later, we will also discuss the benefits of the implementation of standard industry Data Warehouse model.
Dra?en Ore??anin / DWH architect / more information at www.dwh-models.com
South Africa (SA): Demand Planning Executive at BAT
6 年Láudio Mufume check this.
Consultant, TCS
7 年Great article Drazen. What a nice and simple way you have articulated. Thanks for publishing this article.