Dimensions in Datawarehouse
How many types of dimensions have in DATA WAREHOUSE?
At first, we need to talk about dimensions:
What is dimension:
Dimension tables contain descriptive information. Descriptive information is data such as the customer's name, job title, company name, and even geographic information about the customer's location.
Each dimension has a primary key called the Surrogate Key. The primary key in the source table is stored in the dimension table as a Business Key. So, the Business Key is the primary key in OLTP tables and the Surrogate Key is the Identity Column in a Dimension table.
Once we understand what the Dimension is, we need to get acquainted with its types.
Types of Dimensions:
1-Slowly Changing Dimensions
One of the most important things in business is the issue of time, so if we want to solve the problem of changing the data of dimension tables over time, The solution is SDC.
The purpose of this action is we must save the pre-change and post-change values. There are three common designs for SCDs.
Type 1: SCD data is replaced on previous data.
Type 2: By adding the start date and end date field to dimension tables to record the change history.
Type 3: In this case, the processing to track changes is somewhat simplified so that we only store the current record and the old record along with the date of the change in the table.
2- Rapidly Changing Dimensions
This type happened when if one or more attributes of a table change too fast in many rows. Managing this type of dimension in a database is extremely difficult.
One of the solutions to manage Rapidly Changing Dimension is:
Move attributes that are constantly changing to another separate dimension table, store them there with a separated foreign key link with the fact table.
3- Junk Dimensions
A junk dimension is a separated table with a combination of unrelated attributes with low cardinality.
This type of dimension is used to reduce the number of dimensions in our model and also to remove the number of foreign keys from the Fact table.
领英推荐
4- Inferred Dimension
I call this kind of dimension the sudden dimension.
First, I have to say If your data source is very well designed, you should never encounter this scenario. When the fact data is available in the source but related dimension data does not yet exist the Inferred dimension is used this means loading fact records which a dimension record may not yet be ready.
For example:
Imagine in our business we show the public holidays of the year but one day something important happens and is added to the public holidays in our calendar, When the ETL would process the data, it will find that the new holiday day is not available.
5- Conformed Dimensions
When one dimension is used in more than one business process, we called it Conformed Dimensions.
For example, we know that in all businesses, time and date are one of the most important tables of any business. This dimension may be used with multiple fact tables in one database or multiple data marts and data warehouses.
6- Degenerate Dimensions
A degenerate dimension is when the dimension attribute is not stored in a separate table and Insert data directly into Fact. we usually called Degenerate dimensions fact dimensions.
This is because when we have a very large fact table, useful dimensional data is sometimes stored in a fact table to reduce duplication.
7-Role Playing Dimensions
We called this dimension “dimension with a multiple meaning”, which means This dimension appears aliased several times in the same fact table with unique labels. in Datawarehouse structure we can create independent views for each fact table foreign key.
For example, the Location dimension is a Role-Playing Dimension. We can imagine we have a fact table which has Customer Address and Shipping Address. We can create two aliases of the location dimension (two roles) in the semantic layer of the reporting/OLAP tool and name them as Order location dimension, Shipment location dimension.
8- Shrunken Dimensions
shrunken dimension in Kimball's dimensional modelling is a subset of a dimension’s attributes from the base dimension. For example, a day dimension would be a shrunken dimension of the Date dimension. The day dimension could be connected to a fact table whose grain is at the daily level.
9- Static Dimensions
There are some dimensions we used to configure manually not from a real data source system that is called Time Dimensions and Locations that is got GRE dimensions and will be inserted into the database only once. This type of dimension will be static dimensions because this dimension data we are not registering from source and we are not receiving from ETL loads.
CDO at Farabi Brokerage Co.
3 年????
Development Manager | Digital Analyst | IT Project Manager | Scrum Alliance-CSM?
3 年very useful, thank you Ehsan????