Power BI Intermediate Level: 36 - Data Modelling and Table Relationships
Richard Steinbiss
Head of Business Intelligence and Data Analytics (Power BI, AWS, Azure, Spark, Python, Kubernetes)
Short summary: Rather than dealing with a single, wide table, you can normalize your data model by splitting split your data model into a “Fact” (Transaction) table and “Dimension” (Lookup) tables and setting up table relationships. This has many advantages, including enabling you to filter multiple fact tables simultaneously with one dimension table and build one visual with columns from different related tables.
So far, we have only dealt with single tables, which is great for getting started with Power BI. On a simple level, you can combine all of your data into one big table for data analysis, like we have done so far. For example, the following mockup sales data for a chain of three restaurants with five menu items is combined into a single, wide table. In data modelling terms, this is referred to as the denormalized form.
Table Normalization
However, in many cases it is advisable to split the main table into a “Fact” (Transaction) table and “Dimension” (Lookup) tables and setting up table relationships, as indicated below. This is referred to as a normalized form.
Benefits
Performing normalization gives us several key benefits as compared to dealing with a single, wide table:
Power BI’s data modeling engine is built to handle such relationships well. Note that we will do data modelling in the Data view whereas we did data ingestion and preprocessing in Power Query.
Unique Record Identifiers (IDs)
In order to facilitate relationships, we need to be able to uniquely identify each record (row) in our tables. One way would be to use names, such as the store names, but this is problematic because names can change in the future. We usually choose one of the following two approaches for creating unique identifiers (IDs):
In our example, we just use short numbers as keys.
领英推荐
Performing Normalization
Ideally, the data would already be made available in a normalized form in a database by the data owner and you would only need to import it. But sometime you need to do this task yourself. If you decide to make up IDs, it is best to align with the data owner and change the data model in the source so that other people use the same IDs.
Assume that we start with one wide table contain all of the sales data, as in the screenshot at the top of the article. Let us now split this table into one transaction table and two dimension tables. I have done manually in Excel, replacing the store and menu texts with number keys and then building the dimension tables using those same keys. You could do the same in Power Query by adding index columns but it would be a bit more complicated.
Setting Up Table Relationships
Let's load the three tables into Power BI. I renamed the tables accordingly with DIM_ and FACT_ prefixes, which is a commonly used practice to describe the roles of the tables. In the Model view in Power BI, arrange the DIM_Menu and DIM_Stores tables to be on top and connect the Menu Item ID columns and the Store ID columns from the dimension tables to the corresponding columns of the FACT_Transactions table. Alternatively, you could set the relationship under Manage relationships.
Let's visually examine the relationship arrows:
In a dimension table, the key which uniquely identifies the row is called the Primary Key. In a fact table, a key which references a record in a dimension table and can be used in many rows is called a Foreign Key .
Visuals Using Table Relationships
Now that you have relationships between the table set up set, you can use columns from the different related tables in one visual. You can also use the fact and two dimension tables to display a Table visual like the original form. And of course we can do much more with this, as we will see in the following articles.
Next up, we will start creating our own custom measures (formulas) for powerful, dynamic calculations.
Please like, share, and subscribe and feel free to ask questions in the comments below.