登录查看更多内容

Quick Tip - High Performance Running Total

datalab

We are your data analytics and business intelligence implementation partner, connecting your people, processes and data.

发布日期: 2022年4月11日

In this post we will illustrate a high performance method for calculating a running total by traversing a table index.

I’ve seen many examples of people using sub queries and CROSS JOIN’s in order to accomplish the same thing, the only problem with these is the overhead of joining so many rows together.?In this example I’ll illustrate how to traverse an index and create a running total at the same time.

We are going to use the AdventureWorksDW2008.?I am going to create a table that summaries the sales from the FactInternestSales table and then apply the running total.

First create the table that we will use to apply the running total, note that I've created a clustered index on this table. The clustered index is very important as we’ll us the index to ensure that the running total is created in the correct order.

USE AdventureWorksDW2008
GO?

IF OBJECT_ID('RunningTotal', N'U') IS NOT NULL?
? ? DROP TABLE RunningTotal;

CREATE TABLE RunningTotal?
(
? CustomerKey INT NOT NULL
? ,OrderDateKey INT NOT NULL
? ,SalesAmount money NOT NULL
? ,RunningTotal money NULL
)?

CREATE CLUSTERED INDEX idx1 ON RunningTotal (CustomerKey, OrderDateKey);

Next populate the table with summarised data from the FactInternestSales table.?Because a customer can have multiple orders on a day with multiple product we need to aggregate the data up to a single record for the day.

领英推荐

PlanStudio: Not Just a Plan...

Locatrix 1 年前

Surviving June! - Month of June 2023

The Autonomous Robotic Vehicle Project (ARVP), University of Alberta 1 年前

Do Gantt charts drive you crazy?

Charter and Go 1 年前


INSERT INTO RunningTotal
(? ?
? ?CustomerKey,
? ?OrderDateKey,
? ?SalesAmount
)
SELECT
	CustomerKey
	,OrderDateKey
	,SUM(SalesAmount) AS SalesAmount
FROM
	FactInternetSales WITH(NOLOCK)
GROUP BY
	CustomerKey
	,OrderDateKey

Now for the interesting bit.?T-SQL supports a method by which you can assign a value to a field and update a variable at the same time.

The calculation of the equation is calculated from right to left, so first we identify in the CASE statement if the CustomerKey on this record is the same as the CustomerKey on the previous record.?If they are the same then we take the current @RunningTotal value and add the SalesAmount to this.?This is then updated into the RunningTotal field on the table and then this intern updated the @RunningTotal variable with the new total, ready to be used in the next iteration of the update.?If the CustomerKeys are not the same then we know this is a new customers RunningTotal and so the first entry will be equal to the SalesAmount on that day.

Notice that we also update @PrevCustomerKey which will be used for comparison in the next iteration of the update.

The real trick here is to force the query to use the index, this way you ensure that the records are read and updated in the correct sequence.


UPDATE RunningTotal
SET @RunningTotal = RunningTotal =?
	CASE?
		WHEN (CustomerKey = @PrevCustomerKey)?
		THEN @RunningTotal + SalesAmount?
		ELSE SalesAmount?
	END
	,@PrevCustomerKey = CustomerKey?
FROM?
	RunningTotal WITH (INDEX(idx1), TABLOCKX);


SELECT?
	*?
FROM?
	RunningTotal WITH(NOLOCK)
ORDER BY
	CustomerKey
	,OrderDateKey

Quick Tip - High Performance Running Total

datalab

We are your data analytics and business intelligence implementation partner, connecting your people, processes and data.

领英推荐

datalab的更多文章

社区洞察

其他会员也浏览了

Implement Anything: Understanding ROI - #BIMROIds? No prob. (Part 3!)

Put Your Thumb on the Hose

GLITCHING OUT

This F1 team runs on Microsoft Excel

A discussion with my 5 year-old that enhanced the quality of my Job & Health

The Myth of Build/Buy Decisions

Magic pencil and the Vee-model

Everyday Sacred Spaces

How Making Paper Airplanes led to a Manufacturing Startup for Video-Based Operating Procedures

Can't Decide? Try This.

领英推荐

datalab的更多文章

Converting PDFs to Usable Data

Fresh Produce in South Africa

Understanding Laboratory Time Allocation

Understanding Laboratory Demand

Harnessing the Power of Laboratory Data

Quick Tip - Using SQL Row Constructors

Quick Tip - SQL Multiplication Without a Cursor

社区洞察

其他会员也浏览了

Implement Anything: Understanding ROI - #BIMROIds? No prob. (Part 3!)

Put Your Thumb on the Hose

GLITCHING OUT

This F1 team runs on Microsoft Excel

A discussion with my 5 year-old that enhanced the quality of my Job & Health

The Myth of Build/Buy Decisions

Magic pencil and the Vee-model

Everyday Sacred Spaces

How Making Paper Airplanes led to a Manufacturing Startup for Video-Based Operating Procedures

Can't Decide? Try This.