登录查看更多内容

"MicroPDT?" - Innovative solution to Optimizing Performance in Looker (11 min read)

Mehul Rajen Shah

Delivery Lead | GenAI, Data & Analytics | 2x C-Phi-O Leaderboard & TCS Young Achiever Award

发布日期: 2023年11月11日

In the dynamic landscape of data analytics, optimizing performance is a constant pursuit. Looker, a powerful Data Modelling and exploration platform along with Dashboarding Capabilities, offers a solution to enhance query speed and efficiency through Persistent Derived Tables (PDTs).

In this article, we'll explore an innovative solution "MicroPDT" in Looker and how they can be leveraged for performance optimization.

Article Agenda:

Persistent Derived Tables (PDT) and its challenges - 1 min
Aggregate awareness (AA) and its challenges - 1 min
MicroPDT? solution and its advantages over (PDT) and (AA) - 1 min
Steps to implement MicroPDT - 8 mins

Persistent Derived Tables (PDTs) and its challenges:

Looker strongly recommends to use PDT's to optimize query performance

Major challenge are:

It is a huge dataset with almost all columns and rows
Since the dataset is huge, the query has to traverse a lot of columns/rows to find results.

Aggregate Awareness

Looker also recommends using aggregate awareness (AA).

Some of its major challenges

Static combination of filters which needs to be defined?during build. the filter change then AA does not work
Limitations with custom dimensions and measures
It is a Native Derived table hence it introduces warnings while using user attributes and hence you cannot implement Row Level Security

MicroPDT

Micro PDT is a Hybrid solution leveraging the best of both the worlds - PDT and AA and at the same time avoiding the limitations they have.

In Layman terms, Understand this as a smallest dataset with required columns and rows which is required for Dashboard KPI or visualization. Instead of Dashboard going directly to Parent PDT's it will retrieve data from MicroPDT's which is quite smaller in size.

MicroPDT's are like child PDT's which sits in between the Parent PDT and the actual Dashboard KPI.

Lets understand how it works:

Its is a view Based solution and a SQL Derived table it works well with user attributes and hence can apply Row Level security.
We do not apply the filters unless required. The final filtration happens directly in the dashboard. This really avoids limitations which AA has.
We can leverage the generated SQL and custom build it for pre-storing the results
Data scanned is low since it only has to invoke the MicroPDT’s which is very small in size (MB's or KB's)
Cost of the query is less since it scans very less data. Hence saves some ??

Thinking why MicroPDT - "I couldn't thing of a better name but "MicroPDT?" since it really works at micro level and giving performance at larger level much better than PDT's" - Mehul Rajen Shah

Steps to implement MicroPDT

Step 1> Explore the long running KPI from the Dashboard

Step 2> Open the SQL tab (under Data ribbon) and Click on Open in SQL Runner

领英推荐

Data Navigation and Analysis with Universal Search and…

Adam Morton 8 个月前

How to know your data?

Coditation 1 年前

Update August 2024

Wael BAAZOUZI 7 个月前

Step 3> Run the query to see the results

Step 4> Click Settings and Get Derived Table LookML

Step 5> This should give you something like below, click add it to your project.

Step 6> Just select the right Project and give a standard view name. Click Add.

Step 7> Now the SQL Derived table view file (i.e. MicroPDT) has been added to your Project. Please note: this is still a Derived table and to make it a PDT you need to add persistence strategy.

To know more adding persistence please read this article

Step 8> In the code please remove any aggregations and replace that with column. We do not want any pre aggregations happening since this will limit us to slice and dice the data later.

Step 9> Add required dimensions and measures in the view file required for the Dashboard filters to work. Save.

Step 10> In the model file using this view file create an explore e.g. explore_sql_runner_query. Save & Validate.

Step 10> Now using this newly created explore build the original KPI in the explore

Step 11> Save as to a New or existing dashboard.

Now if you notice the original query scanned size and time would be much higher compared to the newly created one. MicroPDT has much lower dataset and hence the time to run the query against it is comparatively low.

Do we have any Downside, Yes:

If you have a table grid in the dashboard and if you turn that into a MicroPDT you may not see much of a size difference
Since sits on top of Parent PDT it uses extra space from your storage
Additional columns many not be by-default present. Hence if you need new dimensions/measures you may need to add it to the query first.

Our Story

I and my team at Quantiphi have developed and worked on this innovative solution which really helped to bring down the run times to massive 60+%. This has not only helped reduce the dashboard run times but also save a bit of cost of running heavy queries on large datasets.

About Me

Mehul Rajen Shah - have a more than a decade experience in Data Analytics & GenAI driving excellence in Delivery & Automation, Optimization Initiatives. I love to connect with people and discover new ideas.

Please share your thoughts if this helps and your feedback.

Rutuja Gugale

Framework Engineer at Quantiphi || Looker || GCP || Terraform

1 年

This really helped alot. Thanks Mehul Rajen Shah for exploring this innovative Micro PDT solution.

1 次回应

Mehul Rajen Shah

Delivery Lead | GenAI, Data & Analytics | 2x C-Phi-O Leaderboard & TCS Young Achiever Award

1 年

Note - this might look lengthy, but trust me the ROI of efforts is huge. Please comment if you have tried/tested and/or facing issues.

查看更多评论

要查看或添加评论，请登录

Mehul Rajen Shah的更多文章

Unlocking Business Insights with Looker PDT (Persistent Derived Tables) (~9 min read)

2023年9月24日

Unlocking Business Insights with Looker PDT (Persistent Derived Tables) (~9 min read)

In today's data-driven world, businesses rely on data analytics and business intelligence tools to make informed…

"MicroPDT?" - Innovative solution to Optimizing Performance in Looker (11 min read)

Mehul Rajen Shah

Delivery Lead | GenAI, Data & Analytics | 2x C-Phi-O Leaderboard & TCS Young Achiever Award

Persistent Derived Tables (PDTs) and its challenges:

Aggregate Awareness

MicroPDT

Lets understand how it works:

Steps to implement MicroPDT

领英推荐

Mehul Rajen Shah的更多文章

社区洞察

其他会员也浏览了

?? ???????????????? ???????????????? ???????? ?????? ?????? ????????! ???????? ?????????? ?????? ?????????????????? ??????????????????.

Microsoft Data Platform News 2024 - Week 09

Data News - May

Data Quality & Cleaning: The Foundation for Reliable Analysis

Microsoft Data Platform News 2024 - Week 01

Hex Maps revisited - is it a good way to visualise data?

Microsoft Data Platform News 2023 - Week 48

Unlocking the Power of Direct Lake Mode in Power BI: A Game-Changer for Large-Scale Data Analysis

Data News - October

Persistent Derived Tables (PDTs) and its challenges:

Aggregate Awareness

MicroPDT

Lets understand how it works:

Steps to implement MicroPDT

领英推荐

Mehul Rajen Shah的更多文章

Unlocking Business Insights with Looker PDT (Persistent Derived Tables) (~9 min read)

社区洞察

其他会员也浏览了

?? ???????????????? ???????????????? ???????? ?????? ?????? ????????! ???????? ?????????? ?????? ?????????????????? ??????????????????.

Microsoft Data Platform News 2024 - Week 09

Data News - May

Data Quality & Cleaning: The Foundation for Reliable Analysis

Microsoft Data Platform News 2024 - Week 01

Hex Maps revisited - is it a good way to visualise data?

Microsoft Data Platform News 2023 - Week 48

Unlocking the Power of Direct Lake Mode in Power BI: A Game-Changer for Large-Scale Data Analysis

Data News - October