How to Make Power BI DirectQuery Fast and Optimized
Sujeet Singh
Delivering Data Analytics, Gen AI, Business Intelligence and Cloud Solutions | Microsoft Certified Azure Data Architect and Expert - Data Management
DirectQuery reports in Power BI tend to be slower than import mode reports. Complex aggregations or extensive slicing and dicing can slow these reports down even further, potentially frustrating end users. To address this, Microsoft provides an advanced setting for DirectQuery reports and composite data models: "Automatic Aggregations".
Automatic aggregations leverage advanced machine learning (ML) to enhance the performance of Direct-Query semantic models by continuously optimizing them. Built on the foundation of the user-defined aggregations framework introduced with composite models in Power BI, automatic aggregations simplify the process by removing the need for extensive data modeling and query optimization expertise. These aggregations are self-training and self-optimizing, allowing model owners—regardless of their skill level—to improve query performance and deliver faster report visualizations for large data models.
Benefits of Automatic Aggregations:
Supported plans
Automatic aggregations are supported for Power BI Premium per capacity, Premium per user, and Power BI Embedded models.
Permissions
To enable and configure automatic aggregations, you must be the model owner. Workspace admins can take over ownership of a model to configure its automatic aggregations settings.
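If a workspace admin needs to assume ownership before configuring automatic aggregations, the take-over call in the Power BI REST API can be scripted. The sketch below is a minimal Python example; the access token, workspace ID, and dataset ID are placeholders you would supply yourself, and the token needs sufficient dataset permissions.

import requests

# Placeholders - supply your own Azure AD access token, workspace (group) ID,
# and dataset ID.
ACCESS_TOKEN = "<azure-ad-access-token>"
WORKSPACE_ID = "<workspace-guid>"
DATASET_ID = "<dataset-guid>"

# Datasets - Take Over In Group: transfers ownership of the dataset to the caller.
url = (
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
    f"/datasets/{DATASET_ID}/Default.TakeOver"
)
response = requests.post(url, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})
response.raise_for_status()  # HTTP 200 means the calling user is now the owner
print("Dataset ownership taken over; automatic aggregations can now be configured.")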
With DirectQuery, whenever a user opens or interacts with a report, DAX queries are sent to the query engine, which then forwards them as SQL queries to the backend data source. The data source must then process these queries and return the results. Compared to import mode models that store data in memory, this round-trip process in DirectQuery can be both time-consuming and resource-intensive, often leading to slower report performance.
Enabling automatic aggregations in a DirectQuery model can significantly improve performance by reducing the need for constant data source queries. Instead of relying solely on the backend, automatic aggregations use an in-memory cache that stores pre-aggregated results. This cache holds only a small portion of the data from fact and detail tables but still manages to deliver faster query responses and reduce the load on the data source. For queries that aren’t covered by the cache, the system automatically falls back to querying the backend, similar to traditional DirectQuery behavior.
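Conceptually, the engine answers a query from the aggregation cache when the query's shape is covered and otherwise falls back to the data source. The Python sketch below is purely illustrative: the agg_cache dictionary, the answer_query helper, and the query_source fallback are hypothetical stand-ins for the engine's internal behavior, not a real API.

from typing import Any

# Hypothetical in-memory aggregation cache: maps a query "shape"
# (group-by columns plus measure) to pre-aggregated rows loaded at refresh time.
agg_cache: dict[tuple, list[dict[str, Any]]] = {
    (("Date[Year]", "Product[Category]"), "SUM(Sales[Amount])"): [
        {"Year": 2024, "Category": "Bikes", "SalesAmount": 1_250_000.0},
        # ...more pre-aggregated rows
    ],
}

def query_source(group_by: tuple, measure: str) -> list[dict[str, Any]]:
    # Stand-in for the DirectQuery fallback: in reality a SQL query is sent
    # to the backend data source and its results are returned.
    return []

def answer_query(group_by: tuple, measure: str) -> list[dict[str, Any]]:
    key = (group_by, measure)
    if key in agg_cache:
        return agg_cache[key]               # fast path: served from in-memory aggregations
    return query_source(group_by, measure)  # cache miss: DirectQuery round trip

# Covered by the cache -> answered in memory.
print(answer_query(("Date[Year]", "Product[Category]"), "SUM(Sales[Amount])"))
# Not covered -> falls back to the data source.
print(answer_query(("Customer[Country]",), "SUM(Sales[Amount])"))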
Training operations
During the first scheduled refresh of the day or week (based on your chosen frequency), Power BI runs a training operation to analyze query patterns and adjust the in-memory aggregation cache accordingly. This process ensures the cache is optimized as query patterns change.
In this step, Power BI may create, update, or remove aggregation tables, and it also queries the data source to decide what data should be included in the cache. However, the actual aggregation data isn’t loaded during training — it gets loaded in the next refresh.
For example, if your refresh schedule is set for 4:00 AM, 9:00 AM, 2:00 PM, and 7:00 PM, only the 4:00 AM refresh will include both training and data refresh. The remaining refreshes for the day will only update the cache without retraining.
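The schedule itself can be set on the model's settings page or scripted against the Power BI REST API's refresh-schedule endpoint. Below is a minimal sketch matching the example times above, with placeholder token and IDs; which refresh also performs training is decided by the service, not by this call.

import requests

ACCESS_TOKEN = "<azure-ad-access-token>"   # placeholder
WORKSPACE_ID = "<workspace-guid>"          # placeholder
DATASET_ID = "<dataset-guid>"              # placeholder

# Datasets - Update Refresh Schedule In Group: four daily refreshes as in the example.
url = (
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
    f"/datasets/{DATASET_ID}/refreshSchedule"
)
schedule = {
    "value": {
        "enabled": True,
        "days": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
        "times": ["04:00", "09:00", "14:00", "19:00"],
        "localTimeZoneId": "UTC",
    }
}
response = requests.patch(url, json=schedule,
                          headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})
response.raise_for_status()
print("Refresh schedule updated; the first refresh of the cycle will include training.")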
During training, Power BI analyzes past queries from the query log to predict and prepare for future queries. While this method is generally accurate, it doesn’t guarantee that all future queries will be covered by the in-memory cache, especially if they differ from past patterns. Queries not covered by the cache are sent directly to the data source using DirectQuery. However, if these new queries are frequent or important, they may be added to the cache during the next training operation.
The training process has a 60-minute time limit. If it can't finish analyzing the entire query log within this time, a notification appears in the model's Refresh history, and training resumes in the next scheduled session. Once the full query log is processed, the new aggregation data replaces the previous aggregations in the cache.
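To verify that refreshes (and the training they trigger) are completing, you can pull the model's refresh history programmatically as well as in the service UI. Below is a minimal sketch with placeholder IDs; note that training-specific notifications may only surface in the Refresh history pane of the Power BI service.

import requests

ACCESS_TOKEN = "<azure-ad-access-token>"   # placeholder
WORKSPACE_ID = "<workspace-guid>"          # placeholder
DATASET_ID = "<dataset-guid>"              # placeholder

# Get Refresh History In Group: the last ten refresh operations for the model.
url = (
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
    f"/datasets/{DATASET_ID}/refreshes?$top=10"
)
response = requests.get(url, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})
response.raise_for_status()

for refresh in response.json()["value"]:
    # status is typically "Completed", "Failed", or "Unknown" (still in progress).
    print(refresh["refreshType"], refresh["status"],
          refresh.get("startTime"), refresh.get("endTime"))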
Caution
Training and refresh operations can be demanding on both the Power BI service and data source systems. Increasing the percentage of queries that use aggregations requires more data to be queried and calculated during these processes. This can lead to higher resource consumption and may increase the risk of timeouts.
Microsoft provides more detailed information and a step-by-step guide in a whitepaper. Please refer to the Power BI technical whitepapers and documentation for a deeper dive.
#Microsoft #PowerBI #MicrosoftLearn #GenerativeAI #Snowflake #AIApplications #LLMs (Large Language Models) #MachineLearning #ArtificialIntelligence #DataAnalytics #TechInnovation #DataScience #TechTrends #CloudComputing #EnterpriseAI #OpenAI #Copilot #GoogleAI #Google #NVIDIA