Understanding Power BI: Formula Engine, Storage Engine, and VertiPaq vs. DirectQuery

Formula Engine and Storage Engine, ensure efficient query processing. This article delves into how these engines work together, along with a practical example using dummy data. Additionally, we'll explore the differences between VertiPaq and DirectQuery storage modes.


Formula Engine (FE):

  • Role: The Formula Engine is responsible for interpreting DAX (Data Analysis Expressions) queries. It processes the logic, optimizes the query, and performs calculations.
  • Example: Consider a query to calculate the total profit for products with sales exceeding 1200. The Formula Engine will interpret the logic and generate an optimized execution plan.
  • Sequential Processing: The Formula Engine operates sequentially, handling one operation at a time.


Storage Engine (SE):

  • Role: The Storage Engine retrieves data as requested by the Formula Engine. It works in two modes:
  • VertiPaq: Uses an in-memory columnar database to store and compress data, enabling fast performance.
  • DirectQuery: Directly queries data from the source without importing it into memory.
  • Parallel Processing: The Storage Engine can process multiple operations simultaneously, optimizing data retrieval.


VertiPaq:

  • Columnar Storage: VertiPaq is a columnar database that compresses imported data in Power BI.
  • Performance: Since the data is stored in RAM, the read speed is extremely fast.
  • Compression: Data is highly compressed to reduce memory usage and improve performance.

DirectQuery:

  • Live Connection: DirectQuery connects to live data sources and queries them without importing data.
  • Performance Considerations: Query response depends on the speed of the data source.
  • Use Case: Best for scenarios where real-time data or very large datasets are required, making importing impractical.


Let’s illustrate this with a simple dataset:


Example Data

Query:

Calculate total profit for products where Sales > 1200.

  • Products B, D, and E meet the condition.
  • Total profit = 300 + 260 + 280 = 840.


Formula Engine & Storage Engine Process

Formula Engine & Storage Engine Process


The process flow diagram explains the backend interaction between the engines:

  • Formula Engine's Role:

  1. Interprets the query logic (e.g., filter products with Sales > 1200).
  2. Generates an optimized query plan.


  • Storage Engine's Role:

  1. Retrieves data (Sales and Profit) for products B, D, and E from VertiPaq or DirectQuery.
  2. Sends the filtered data back to the Formula Engine.


  • Results:

  1. Formula Engine calculates the total profit (300 + 260 + 280 = 840).
  2. Displays the result in a Power BI visualization (e.g., the chart diagram).



Sales and Profit Example

Diagram:

  • Bars: Represent sales for each product.
  • Line: Indicates profit values.
  • Sales Threshold Line (Orange): Marks the condition Sales>1200\text{Sales} > 1200Sales>1200.
  • "Filtered" Labels (Green): Highlights products that satisfy the condition.

要查看或添加评论,请登录

Sourabh Dharmpurikar的更多文章

社区洞察