Insert, Update and Delete Tricks
The previous articles Insert, Update and Delete Plan Cost, Asynchronous IO and Storage Arrays, and IUD Performance Anomalies in SQL Server together described a set of circumstances in which anomalies can occur in write (IUD) queries. In brief, the anomaly can occur when there are many rows to be written, but the estimated number of rows is low, below the set points at which SQL Server issues asynchronous I/O.
We might argue that recent generation systems allow for massive memory capacity and stupendous I/O capability, all at reasonable cost. But the chasm between what is possible and what is actually deployed is equally great. Many deployments are in the cloud, where people in the infrastructure silo seem to think that 4-8GB memory per vCore is adequate. The greater issue is the infatuation with network storage, resulting in deeply throttled (crippled) I/O.
It is true that much lower I/O latency is possible with SSD storage (sub-millisecond) than with hard disks (5-15 milliseconds depending on class). But there is little assurance that low latency is maintained under high queue depth operation. In the application silo, too many (in management) are enamored of new technology over old discipline, resulting in bloated data structures of questionable normalization.
If we have well configured infrastructure, these anomalies might be tolerable. In the more common throttled environments, we must address them at the query level.
This means we have two avenues of attack. One, influence the estimated number of rows at the write operation so that asynchronous I/O is employed when appropriate. Two, preload the pages to be written into the buffer cache with separate read queries.
Test Tables, Methods and Set Points
Methods from previous articles are retained. The tables have a clustered index that is unique on the single column ID. There are two nonclustered indexes, one on column C and the other on D. Each has an Included column. There may also be a foreign key column F. Table T has 24 rows per unique value of C and 25 rows per value of D. Table T1 has 249 rows per C value and 250 rows per D value.
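As a concrete reference, below is a minimal sketch of a schema consistent with this description. The index names (IXC, IXD) and the included column (CreateD) are assumptions, though IXD matches the index hint used later in this article, and the parent table F matches the foreign key example further on.
-- Sketch of the test schema; names other than T, ID, C, D and F are assumptions
CREATE TABLE dbo.F (F bigint NOT NULL CONSTRAINT PK_F PRIMARY KEY);
CREATE TABLE dbo.T (
  ID bigint NOT NULL CONSTRAINT PK_T PRIMARY KEY CLUSTERED,
  C bigint NOT NULL,
  D bigint NOT NULL,
  F bigint NULL CONSTRAINT FK_T_F REFERENCES dbo.F (F),
  CreateD datetime NULL
);
CREATE INDEX IXC ON dbo.T (C) INCLUDE (CreateD);
CREATE INDEX IXD ON dbo.T (D) INCLUDE (CreateD);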
Initial conditions for testing are as follows. A CHECKPOINT is issued to write dirty pages to disk. DBCC DROPCLEANBUFFERS then removes clean pages from the buffer cache. A series of queries to other tables are run to fill the buffer cache to or close to the target memory. The reason for this is that SQL Server has a different strategy for read I/O from a cold cache to bring the system more quickly to a warm cache state.
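In script form, the reset sequence described above is simply the following, run before each test:
CHECKPOINT;             -- write dirty pages to disk
DBCC DROPCLEANBUFFERS;  -- remove clean pages from the buffer cache
-- then run queries against other tables to bring the buffer cache
-- to (or near) the target memory level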
The SQL Update statement from previous articles is below.
UPDATE dbo.T SET CreateD = GETDATE() WHERE C = 301
The update statement modifies a column that happens to be in the Include list of the nonclustered indexes. Changes to the underlying table and each nonclustered index are required.
No pages from the table are in memory on the first execution; I/O access to IAM, index root, intermediate and leaf level pages is required. When the query is executed again with a different parameter value, I/O is required mostly for the leaf level and possibly the first intermediate level above the leaf. In other words, it is then easier to correlate the physical I/O to the number of rows modified.
In IUD Performance Anomalies, I found examples showing asynchronous I/O access to the main table (clustered index) occurs at Estimated Number of Rows 25 and higher (see comment further on in this article). This is the same set point as for Key Lookup in Select queries. Asynchronous I/O for access to nonclustered indexes seems to occur at Estimated Number of Rows 250 or higher. In both cases, this is evident from SET STATISTICS IO reporting read-ahead reads, even if the access is technically random prefetching.
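A quick way to check for this, per the above, is the read-ahead reads counter. A minimal sketch, run after the cold-cache reset described earlier:
SET STATISTICS IO ON;
UPDATE dbo.T SET CreateD = GETDATE() WHERE C = 301;
SET STATISTICS IO OFF;
-- a nonzero "read-ahead reads" value in the messages output indicates
-- asynchronous I/O was issued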
Estimated Number of Rows Approach
The first method of addressing IUD performance anomalies revolves around the estimated number of rows at the write operation. The number of rows in steps prior to the write operation is not the topic of discussion in this article.
We should make clear first that the purpose of data distribution statistics is to support row estimation. Under certain circumstances, for certain key values, the distribution value may be exact. But whether a given key value is exact or an estimate is not the point.
In fact, we are not entirely interested in whether the estimate is accurate. Our interest is in whether the row estimate is above or below the asynchronous I/O set points, and whether we think I/O, if required, should be synchronous or asynchronous. Furthermore, if we want or expect plan reuse, then there is the matter of plan compile parameter values versus different actual execution parameter values.
A query's search criteria may or may not be amenable to reasonable estimation. Row estimate variation can also arise as estimates propagate through multiple operations.
Below is the parameterized version of our reference Update query.
DECLARE @P1 bigint
UPDATE dbo.T SET CreateD = GETDATE() WHERE C = @P1
Suppose this is part of a stored procedure, and @P1 is a parameter of the procedure. On first execution, the value of @P1 is used for row estimation, and that estimate remains in the plan for subsequent executions until the plan is forced out of cache or a recompile is triggered.
We can use the OPTIMIZE FOR query hint to override the compile parameter value, example below.
DECLARE @P1 bigint
UPDATE dbo.T SET CreateD = GETDATE() WHERE C = @P1
OPTION (OPTIMIZE FOR (@P1 = value))
We would choose the key value to OPTIMIZE FOR as having either fewer than 25 (or 26) rows if we do not want read-ahead at all, between 25 and 249 rows if we want read-ahead on the clustered index but not the nonclustered indexes, or 250 rows and above if we want read-ahead on all.
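For illustration, table T has about 24 rows per value of C, below the 25-row set point, so compiling for any representative value of C suppresses read-ahead; the literal 301 below is the value used earlier and is purely illustrative.
DECLARE @P1 bigint
UPDATE dbo.T SET CreateD = GETDATE() WHERE C = @P1
OPTION (OPTIMIZE FOR (@P1 = 301))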
The difficulty in this approach is that we do not control the statistics range high key values, and these may change. To an extent, there is a degree of repeatability due to the Statistics computation method.
It is necessary to find key values that have the desired distribution and to confirm that the distribution does not change when statistics are updated. This can make it an impractical approach for production environments with inflexible source control policies. It is also necessary to verify that the OPTIMIZE FOR hint has the expected outcome, including a check of which statistics were employed by the compiled plan.
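One way to make that check, assuming SQL Server 2016 SP2 or later, where actual plan XML includes the OptimizerStatsUsage element, is to capture the plan while executing; a minimal sketch:
DECLARE @P1 bigint = 301;
SET STATISTICS XML ON;
UPDATE dbo.T SET CreateD = GETDATE() WHERE C = @P1
OPTION (OPTIMIZE FOR (@P1 = 301));
SET STATISTICS XML OFF;
-- in the returned plan XML, the OptimizerStatsUsage element lists the
-- statistics objects the optimizer consulted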
Another option is to graft statistics from another index having the same keys, including data types (explicit and implicit), with the desired distribution values. This can be done with DBCC SHOW_STATISTICS ... WITH STATS_STREAM on the source and UPDATE STATISTICS ... WITH STATS_STREAM = on the destination.
There is a third way to control statistics, involving the use of a temporary table. With this, we are close to the second method of controlling I/O; it is discussed in the next section.
Preloading Pages into Buffer Cache
The second method of avoiding write performance anomalies is to avoid read I/O in the write statement itself, by preloading the pages to be modified into the buffer cache with separate read queries.
We might ask why introducing otherwise unnecessary queries that do nothing aside from reading pages into the buffer cache would help. The reason is that the write query holds exclusive locks on rows being modified while the query is in progress (note the new Optimized locking feature of Azure SQL, not in on-prem versions yet).
If the write query is active for a protracted period, then the exclusive locks can block other queries, resulting in downstream problems. In transferring the I/O to a Select query, shared locks are employed, and only while each row is being accessed. Furthermore, it is our goal to ensure that I/O generated by the read statement is asynchronous if appropriate.
Here it helps that Select queries have only the 25 (or 26) row set point for asynchronous I/O, versus the write operator with a 25-row set point for the clustered index and a 250-row set point for nonclustered indexes.
It is possible to preload necessary clustered and nonclustered index pages without a temp table. The choice of whether to do so depends on the complexity and efficiency of the filter criteria. The example given here employs a temp table for brevity.
The temporary table below is designed to contain the information necessary to support the preloading queries. The main table clustered index key column(s) is present, along with the necessary nonclustered index key and foreign key columns.
CREATE TABLE #x(ID bigint NOT NULL, C bigint, D bigint, F bigint)
For Insert and Delete, all nonclustered indexes need to be accessed, except filtered indexes under certain circumstances. For Update, only the nonclustered indexes in which key or included columns are modified need to be handled.
Foreign key columns should be populated for Insert statements, but only if modified for Update statements. Delete does not need to consider foreign key columns, but any table with a foreign key referencing this table should have an index on the referencing column.
Two methods of populating the temporary table are described here. One is a multiple step approach. The cluster key column is inserted in the first step. The other columns are populated with a subsequent update. The second is a single step approach in which all columns are inserted using the query with the filter criteria.
Temp Table Populated in Multiple Steps
The two-step approach begins with populating only the clustered index key using the original write statement filter criteria. Example below.
INSERT #x(ID) SELECT ID FROM dbo.T WHERE C = @P1
The execution plan for the above is a nonclustered index seek for a range of rows. This plan is expected to generate read-ahead reads for any additional pages beyond the first leaf level page.
The second step is to update the temporary table to populate the other columns.
UPDATE x SET C = T.C, D = T.D, F = T.F
FROM #x x JOIN dbo.T ON T.ID = x.ID
The execution plan for the Update should be a Scan of #x followed by a (Nested Loops) Join to T with a clustered index seek, unless the row count is excessive or there is a peculiar nonclustered index having the same key as the clustered index.
If such a peculiar nonclustered index exists, then consider a query hint forcing the clustered index.
The filter criterion in this example is simple and the two above steps can easily be consolidated. One reason for dividing the process into two steps is for situations in which the filter criteria is sufficiently complicated that the query optimizer cannot make an accurate row estimate.
This can occur when there are two or more inequality conditions or there is no index matching all the equality conditions. It is not uncommon for complex logic in the filter criteria to result in poor accuracy for the estimated number of rows.
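A hypothetical illustration of such criteria (the parameter @P2 and the combination of inequalities are illustrative only):
DECLARE @P2 bigint
INSERT #x(ID)
SELECT ID FROM dbo.T WHERE C >= @P1 AND D <= @P2
-- two inequality conditions on different columns; no single index
-- matches both, so the combined row estimate can be far off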
By only outputting the clustered index key column, implicitly part of every nonclustered index, the first Insert/Select plan should only have a nonclustered index access and not a Key Lookup operation.
If six or more rows are inserted into the temp table, the next query to use #x will trigger statistics creation and, in turn, a recompile. The statistics and recompile ensure the second statement, updating the other columns of #x, is based on actual rows and not an estimate.
If we do not trust the SQL Server automatic statistics recompute set points, then we could issue an explicit Create Statistics or Create Index statement on the temp table after the Insert and before the Update, as sketched below.
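A minimal sketch of that explicit step; the statistics name SX is an arbitrary choice, and the index form is the same UCX used in the next section:
CREATE STATISTICS SX ON #x(ID)
-- or, also ordering #x for the join back to the main table:
CREATE UNIQUE CLUSTERED INDEX UCX ON #x(ID)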
Transplanted Statistics
If for some reason we are unhappy with actual statistics on the temp table #x, we can transplant statistics. Outside of the procedure with the write statement, we create the temp table #x and populate the table with the desired number of rows.
Create a clustered index on the ID column:
CREATE UNIQUE CLUSTERED INDEX UCX ON #x(ID)
Run DBCC SHOW_STATISTICS as below.
DBCC SHOW_STATISTICS('tempdb..#x', UCX) WITH STATS_STREAM
The output for our example is shown below.
Within the procedure having our write statement, create the temp table, populate the cluster key columns (just ID in this example), and create the same unique clustered index as above.
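Consolidated, the in-procedure steps just described are, as a sketch:
CREATE TABLE #x(ID bigint NOT NULL, C bigint, D bigint, F bigint)
INSERT #x(ID) SELECT ID FROM dbo.T WHERE C = @P1
CREATE UNIQUE CLUSTERED INDEX UCX ON #x(ID)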
Update the statistics on the index as follows using the output of the DBCC SHOW_STATISTICS on the reference table.
UPDATE STATISTICS #x(UCX) WITH STATS_STREAM = 0x01000000010000000000000000000000...
, ROWCOUNT = 50, PAGECOUNT = 1
Comment: In Asynchronous IO and Storage Arrays, I showed that read-ahead (random prefetch) activates for Key Lookup at Estimated Number of Executes 25. It would seem that this occurs at Estimated Number of Executes 26 for a (nested loops) join, even though both operations employ Nested Loops. The main point is to be aware that there may be more unknowns in the actual code path than demonstrated in the small number of examples given in this series of articles.
Below left are the Scan details with 24 rows in the table, based on actual statistics.
Below right are the details with Update Statistics setting false values.
After applying the transplanted statistics, we can update #x, setting the unpopulated columns with a join to the main table on the clustered index key. Depending on the row count of our transplanted statistics, the Update query should generate asynchronous or synchronous I/O as desired.
Given that we have just modified rows of the temp table, the next query using #x may trigger a statistics recompute. If we desire to continue using the transplanted data distribution statistics, I believe the procedure should be as follows.
First, update the statistics explicitly the normal way. This should reset the row modification counter back to zero, indicating no update is necessary.
UPDATE STATISTICS #x(UCX)
Then apply the update with the transplanted statistics as earlier. I have not run this recently, and some investigation is warranted.
Temp Table Populated in a Single Step
If we are not concerned by row estimate uncertainties in the filter criteria, then we can populate the temp table #x in a single step.
Insert rows to be modified into #x.
INSERT #x(ID, C, D, F)
SELECT ID, C, D, F FROM dbo.T WHERE C = @P1
The plan is shown below.
The Index Seek (NonClustered) implements the filter criteria. The Key Lookup to the Clustered index gets the columns needed that are not in the nonclustered index.
If the nonclustered index has all the columns that we need to obtain, then there is no key lookup. This may be great for Select query efficiency (at the expense of higher Insert and Update overhead). But here our goal is to load the main table (clustered index) pages.
In this case, the two-step approach is appropriate. A nonclustered index is used to evaluate the filter criteria. The second step should access the clustered index, as the join criterion is the clustered index key.
Loading Index Pages to Buffer Cache
With the write row set populated in the temp table #x, we can now access the nonclustered index pages that will be written to in the write statement. The access has the effect of loading those pages into buffer cache.
The example below has an index hint to force the index.
DECLARE @p1 bigint
SELECT @p1 = COUNT(*)
FROM #x x JOIN dbo.T WITH(INDEX(IXD)) ON T.D = x.D AND T.ID = x.ID
Notice the join condition specifies both the nonclustered index key and the clustered index key. This ensures a one-to-one join if the clustered index key is unique, avoiding a many-to-many join. The index hint ensures the use of the intended nonclustered index instead of the clustered index.
Each index that requires modification in the write statement should be touched.
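For our Update example, the other nonclustered index (IXC, the assumed name from the schema sketch earlier) would be touched in the same way:
SELECT @p1 = COUNT(*)
FROM #x x JOIN dbo.T WITH(INDEX(IXC)) ON T.C = x.C AND T.ID = x.ID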
Loading Foreign Key Parent Pages to Buffer Cache
The method to touch pages for the foreign key parent table is done in a similar manner, with some changes. In our example, the foreign key column is F, and the parent table is F with key column also F.
SELECT @P1 = COUNT(*)
FROM (SELECT DISTINCT F FROM #x) x
JOIN dbo.F f ON f.F = x.F
The Write Statement
The final task is to accomplish the goal of the original write statement, now having most if not all pages to be written in buffer cache. We can just execute the original statement. If there are complexities in the original statement, it can be more efficient to use our temp table to identify the rows, as below.
UPDATE T SET ... FROM #x x JOIN dbo.T ON T.ID = x.ID
WHERE C = @P1
If we are concerned that the previously identified rows might no longer meet the filter criteria, we can include the criteria as well, as in the WHERE clause above.
Other Use Case
There is another use case for the multi-step method with the temp table, though not in the scope of the main topic. It is mentioned because the methods are largely the same.
Consider an Update statement in which the filter criteria is somewhat complicated. Many rows must be examined, but few rows actually meet the complete criteria. Suppose one of the criteria is an equality condition that matches a nonclustered index key. The execution plan is an index seek on the nonclustered index but still must evaluate many rows.
Instead of executing the Update statement directly, Select the clustered index key into a temp table. The same number of rows are evaluated, but only a shared lock is taken on each row. Use the temp table to join back to the main table for the Update. Exclusive locks are only taken on the rows actually updated.
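A hypothetical sketch of this pattern, reusing #x; the condition on D and the parameter @P2 stand in for the complicated part of the criteria:
INSERT #x(ID)
SELECT ID FROM dbo.T WHERE C = @P1 AND D <= @P2 -- shared locks only while reading
UPDATE T SET CreateD = GETDATE()
FROM #x x JOIN dbo.T ON T.ID = x.ID -- exclusive locks only on the rows updated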
Summary
The rather complicated set of steps here must be crafted to the specific clustered and nonclustered indexes and foreign keys of the target table. It should only be necessary for writes of many rows under specific circumstances: the estimated number of rows is below the asynchronous I/O set points, the actual number of rows is sufficiently high for this to be a problem, and the pages to be written are not in buffer cache.
This problem used to be more severe in the hard disk era, when the best storage systems at low queue depth per HDD could support 5 ms access latency; an overloaded system could be operating at over 20 ms. In the SSD era, 100 μs latency is possible in enterprise devices operating within an appropriate queue depth. However, 1 ms and higher latency can still occur, more likely when network bandwidth is constrained.
The situation to watch for is estimated rows being low but actual rows very high. This is most likely to happen when the filter criteria has complexity that prevents the query optimizer from making a good estimate.
References
Microsoft Learn: Reading Pages, Writing Pages, Statistics.
Posts by Craig Freedman: Sequential Read Ahead, Random Prefetching.