Dimensions in Datawarehouse

Dimensions in Datawarehouse

How many types of dimensions have in DATA WAREHOUSE?

At first, we need to talk about dimensions:

What is dimension:

Dimension tables contain descriptive information. Descriptive information is data such as the customer's name, job title, company name, and even geographic information about the customer's location.

Each dimension has a primary key called the Surrogate Key. The primary key in the source table is stored in the dimension table as a Business Key. So, the Business Key is the primary key in OLTP tables and the Surrogate Key is the Identity Column in a Dimension table.

Once we understand what the Dimension is, we need to get acquainted with its types.

Types of Dimensions:

  1. Slowly Changing Dimensions
  2. Rapidly Changing Dimensions
  3. Junk Dimensions
  4. Inferred Dimensions
  5. Conformed Dimensions
  6. Degenerate Dimensions
  7. Role-Playing Dimensions
  8. Shrunken Dimensions
  9. Static Dimensions

1-Slowly Changing Dimensions

One of the most important things in business is the issue of time, so if we want to solve the problem of changing the data of dimension tables over time, The solution is SDC.

The purpose of this action is we must save the pre-change and post-change values. There are three common designs for SCDs.

Type 1: SCD data is replaced on previous data.

Type 2: By adding the start date and end date field to dimension tables to record the change history.

Type 3: In this case, the processing to track changes is somewhat simplified so that we only store the current record and the old record along with the date of the change in the table.

2- Rapidly Changing Dimensions

This type happened when if one or more attributes of a table change too fast in many rows. Managing this type of dimension in a database is extremely difficult.

One of the solutions to manage Rapidly Changing Dimension is:

Move attributes that are constantly changing to another separate dimension table, store them there with a separated foreign key link with the fact table.

3- Junk Dimensions

A junk dimension is a separated table with a combination of unrelated attributes with low cardinality.

This type of dimension is used to reduce the number of dimensions in our model and also to remove the number of foreign keys from the Fact table.

4- Inferred Dimension

I call this kind of dimension the sudden dimension.

First, I have to say If your data source is very well designed, you should never encounter this scenario. When the fact data is available in the source but related dimension data does not yet exist the Inferred dimension is used this means loading fact records which a dimension record may not yet be ready.

For example:

Imagine in our business we show the public holidays of the year but one day something important happens and is added to the public holidays in our calendar, When the ETL would process the data, it will find that the new holiday day is not available.

5- Conformed Dimensions

When one dimension is used in more than one business process, we called it Conformed Dimensions.

For example, we know that in all businesses, time and date are one of the most important tables of any business. This dimension may be used with multiple fact tables in one database or multiple data marts and data warehouses.

Conformed Dimensions

6- Degenerate Dimensions

A degenerate dimension is when the dimension attribute is not stored in a separate table and Insert data directly into Fact. we usually called Degenerate dimensions fact dimensions.

This is because when we have a very large fact table, useful dimensional data is sometimes stored in a fact table to reduce duplication.

7-Role Playing Dimensions

We called this dimension “dimension with a multiple meaning”, which means This dimension appears aliased several times in the same fact table with unique labels. in Datawarehouse structure we can create independent views for each fact table foreign key.

For example, the Location dimension is a Role-Playing Dimension. We can imagine we have a fact table which has Customer Address and Shipping Address. We can create two aliases of the location dimension (two roles) in the semantic layer of the reporting/OLAP tool and name them as Order location dimension, Shipment location dimension.

Role Playing Dimensions

8- Shrunken Dimensions

shrunken dimension in Kimball's dimensional modelling is a subset of a dimension’s attributes from the base dimension. For example, a day dimension would be a shrunken dimension of the Date dimension. The day dimension could be connected to a fact table whose grain is at the daily level.

9- Static Dimensions

There are some dimensions we used to configure manually not from a real data source system that is called Time Dimensions and Locations that is got GRE dimensions and will be inserted into the database only once. This type of dimension will be static dimensions because this dimension data we are not registering from source and we are not receiving from ETL loads.




Borna Najafi

CDO at Farabi Brokerage Co.

3 年

????

回复
Matin Beheshti

Development Manager | Digital Analyst | IT Project Manager | Scrum Alliance-CSM?

3 年

very useful, thank you Ehsan????

回复

要查看或添加评论,请登录

Ehsan Hemati的更多文章

  • Airflow Task - Part 3

    Airflow Task - Part 3

    In this article, I'll discuss passing arguments to DAGs and Python operator tasks. So first we'll look at some common…

    3 条评论
  • Apache Airflow for Data Engineering pipeline Part 2 - Airflow Workflow

    Apache Airflow for Data Engineering pipeline Part 2 - Airflow Workflow

    In the pre-article, a basic explanation of what airflow is and how it works. I also explained about the dashboard.

    2 条评论
  • Apache Airflow for Data Engineering pipeline Part 1

    Apache Airflow for Data Engineering pipeline Part 1

    To build any big data product, you need data to work with and just like pipelines allow us to gather and transport oil…

    4 条评论

社区洞察

其他会员也浏览了