Understanding Query Folding in Power Query

Understanding Query Folding in Power Query

Query folding is an optimization process where Power Query tries to push as much execution as possible to data source either on-premises or in the cloud. The Query Folding determines which parts of query can be executed directly by the data source such as SQL Server, Fabric Lakehouse, Fabric Warehouse etc.

Demo

In the screenshot below, I have transaction2016 and transaction2017 as #delta tables resident in Lakehouse1 and Lakehouse2 within my Fabric Tenant. Dataflow Gen2 is used to access and append the two tables from the Lakehouses.

?After the append operation, in the applied step, the Source step shows a green indicator. This implies that the step in evaluated at the source data level which in my case, are the Lakehouse1 and Lakehouse2 respectively. Power Query pushed the execution to the source data and this is called Query Folding

A right-click on the folding Source steps shows a fly over that allows viewing query plan

In the Query Plan, the Value.NativeQuery is visible that shows all the selected columns in the LH1 (Lakehouse1) table called transaction2016 with union all SQL operator appending equal number of columns from the the LH2 (Lakehouse2) table called transaction2017

A click on the Sql.Database view details shows that EnableCrossDatabaseFolding is set to true implying query folding



Conversely, within the same applied steps, Groupby operation is performed that returned a red indicator. This implies that the GroupBy step is evaluated outside the source data (evaluated locally and not-folding)

In conclusion, by pushing as much execution to the source data as possible, data transfer and processing within the Power Query engine is reduced. In addition, only relevant data is retrieved, leading to faster query execution.


要查看或添加评论,请登录