Quick Tip - High Performance Running Total
In this post we will illustrate a high performance method for calculating a running total by traversing a table index.
I’ve seen many examples of people using sub queries and CROSS JOIN’s in order to accomplish the same thing, the only problem with these is the overhead of joining so many rows together.?In this example I’ll illustrate how to traverse an index and create a running total at the same time.
We are going to use the AdventureWorksDW2008.?I am going to create a table that summaries the sales from the FactInternestSales table and then apply the running total.
First create the table that we will use to apply the running total, note that I've created a clustered index on this table. The clustered index is very important as we’ll us the index to ensure that the running total is created in the correct order.
USE AdventureWorksDW2008
GO?
IF OBJECT_ID('RunningTotal', N'U') IS NOT NULL?
? ? DROP TABLE RunningTotal;
CREATE TABLE RunningTotal?
(
? CustomerKey INT NOT NULL
? ,OrderDateKey INT NOT NULL
? ,SalesAmount money NOT NULL
? ,RunningTotal money NULL
)?
CREATE CLUSTERED INDEX idx1 ON RunningTotal (CustomerKey, OrderDateKey);
Next populate the table with summarised data from the FactInternestSales table.?Because a customer can have multiple orders on a day with multiple product we need to aggregate the data up to a single record for the day.
领英推荐
INSERT INTO RunningTotal
(? ?
? ?CustomerKey,
? ?OrderDateKey,
? ?SalesAmount
)
SELECT
CustomerKey
,OrderDateKey
,SUM(SalesAmount) AS SalesAmount
FROM
FactInternetSales WITH(NOLOCK)
GROUP BY
CustomerKey
,OrderDateKey
Now for the interesting bit.?T-SQL supports a method by which you can assign a value to a field and update a variable at the same time.
The calculation of the equation is calculated from right to left, so first we identify in the CASE statement if the CustomerKey on this record is the same as the CustomerKey on the previous record.?If they are the same then we take the current @RunningTotal value and add the SalesAmount to this.?This is then updated into the RunningTotal field on the table and then this intern updated the @RunningTotal variable with the new total, ready to be used in the next iteration of the update.?If the CustomerKeys are not the same then we know this is a new customers RunningTotal and so the first entry will be equal to the SalesAmount on that day.
Notice that we also update @PrevCustomerKey which will be used for comparison in the next iteration of the update.
The real trick here is to force the query to use the index, this way you ensure that the records are read and updated in the correct sequence.
UPDATE RunningTotal
SET @RunningTotal = RunningTotal =?
CASE?
WHEN (CustomerKey = @PrevCustomerKey)?
THEN @RunningTotal + SalesAmount?
ELSE SalesAmount?
END
,@PrevCustomerKey = CustomerKey?
FROM?
RunningTotal WITH (INDEX(idx1), TABLOCKX);
SELECT?
*?
FROM?
RunningTotal WITH(NOLOCK)
ORDER BY
CustomerKey
,OrderDateKey