Power BI Intermediate Level: 36 - Data Modelling and Table Relationships

Power BI Intermediate Level: 36 - Data Modelling and Table Relationships

Table of Contents?|?Power BI Report File?| Sample Input

Short summary: Rather than dealing with a single, wide table, you can normalize your data model by splitting split your data model into a “Fact” (Transaction) table and “Dimension” (Lookup) tables and setting up table relationships. This has many advantages, including enabling you to filter multiple fact tables simultaneously with one dimension table and build one visual with columns from different related tables.

So far, we have only dealt with single tables, which is great for getting started with Power BI. On a simple level, you can combine all of your data into one big table for data analysis, like we have done so far. For example, the following mockup sales data for a chain of three restaurants with five menu items is combined into a single, wide table. In data modelling terms, this is referred to as the denormalized form.

So far we have dealt with single, wide tables combining all relevant information.

Table Normalization

However, in many cases it is advisable to split the main table into a “Fact” (Transaction) table and “Dimension” (Lookup) tables and setting up table relationships, as indicated below. This is referred to as a normalized form.

You can normalize your data model by splitting your single main table into a “Fact” (Transaction) table and “Dimension” (Lookup) tables.

Benefits

Performing normalization gives us several key benefits as compared to dealing with a single, wide table:

  • If Dimension data needs to be updated, it can be updated in a single place, thus reducing error potential
  • We avoid a lot of duplicate data and may also improve performance
  • We can perform summarizations such as Count Rows using dimension tables without having to consider duplicates
  • We can use one Dimension table (such as a calendar table) to simultaneously filter multiple Fact tables
  • We can better separate the data and its ownership. We can more easily set access restrictions

Power BI’s data modeling engine is built to handle such relationships well. Note that we will do data modelling in the Data view whereas we did data ingestion and preprocessing in Power Query.

Unique Record Identifiers (IDs)

In order to facilitate relationships, we need to be able to uniquely identify each record (row) in our tables. One way would be to use names, such as the store names, but this is problematic because names can change in the future. We usually choose one of the following two approaches for creating unique identifiers (IDs):

  1. We can use s short number or a mixture of letters and numbers, similar to a license plate on a car, for example "RA-KL-8136". This is the most readable and simple to type while giving some information about the record.
  2. We can use Globally Unique Identifiers (GUIDs), which look like “ac5fb9dc-3562-48a2-857a-e84dc818f1be”. There are so many different GUIDs that a randomly generated GUID can practically considered globally unique compared to any other GUID created by anybody else. This is the safest in terms uniquely identifying records but it is difficult to read and type.

In our example, we just use short numbers as keys.

Performing Normalization

Ideally, the data would already be made available in a normalized form in a database by the data owner and you would only need to import it. But sometime you need to do this task yourself. If you decide to make up IDs, it is best to align with the data owner and change the data model in the source so that other people use the same IDs.

Assume that we start with one wide table contain all of the sales data, as in the screenshot at the top of the article. Let us now split this table into one transaction table and two dimension tables. I have done manually in Excel, replacing the store and menu texts with number keys and then building the dimension tables using those same keys. You could do the same in Power Query by adding index columns but it would be a bit more complicated.

This is the normalized input data with a transaction table and two dimension tables.

Setting Up Table Relationships

Let's load the three tables into Power BI. I renamed the tables accordingly with DIM_ and FACT_ prefixes, which is a commonly used practice to describe the roles of the tables. In the Model view in Power BI, arrange the DIM_Menu and DIM_Stores tables to be on top and connect the Menu Item ID columns and the Store ID columns from the dimension tables to the corresponding columns of the FACT_Transactions table. Alternatively, you could set the relationship under Manage relationships.

In the Model view, connect the ID columns to create table relationships

Let's visually examine the relationship arrows:

  • The arrows point from the dimension tables down to the fact table. This means that when you filter a dimension table in your report, the filter gets propagated down to the fact table. You can think of it like water flowing downwards but not upwards. For this analogy to work, you need to place the dimension tables above the fact table. You could change the filtering direction to bi-directional, so that your fact table could also filter up to the dimension tables, however this is only advisable in in rare cases.
  • Note as well the “1” and “*” at the ends of the relationship arrows. This denotes a one-to-many relationship, meaning that the items in the dimension table are unique, whereas these items can be referenced many times in the fact table. For example, there is only one store Cologne Central in table DIM_Stores but this store can be referenced in many transactions in the FACT_Transactions table.

In a dimension table, the key which uniquely identifies the row is called the Primary Key. In a fact table, a key which references a record in a dimension table and can be used in many rows is called a Foreign Key .

Visuals Using Table Relationships

Now that you have relationships between the table set up set, you can use columns from the different related tables in one visual. You can also use the fact and two dimension tables to display a Table visual like the original form. And of course we can do much more with this, as we will see in the following articles.

You can create visuals by using columns from different related tables.

Next up, we will start creating our own custom measures (formulas) for powerful, dynamic calculations.

Please like, share, and subscribe and feel free to ask questions in the comments below.

Next article:?Quick Measures for Simple Custom Formulas


要查看或添加评论,请登录

Richard Steinbiss的更多文章

社区洞察

其他会员也浏览了