SQL Server Execution Plan Shaping

SQL Server Execution Plan Shaping

Execution Plan Shaping in SQL Server

A very long time ago, in the early days of relational databases, when system memory was not terabytes or even gigabytes, but megabytes and single digit at that, complex queries (joining more than four tables?) were beyond the capability of the query optimizer of that era. In that time, one of the database professional's skills was to know the method of writing SQL that explicitly specified the join sequence between tables. The sequence and order of joins is represented in the execution plan shape, along with other operators.

By the mid or late 1990's, the query optimizer could handle more complex queries, although producing a plan did take some time. With this, explicit plan shaping quickly became a disregarded and no longer taught skill? Sometime after 2010 or thereabouts, the SQL Server query optimizer could even produce plans for complex queries quickly and efficiently. Presumably, this is by knowing what not to evaluate from the near astronomical number of possible plans. More recently, SQL Server beginning with version 2017 introduced the first element of what was to become?Intelligent Query Processing?(IQP). Each new version then expanding the scope.

With these developments, is there a place for explicit plan shaping? That is: employing intelligent SQL professionals. Or is the future for engine/optimization illiterates relying on the intelligence of software? The purpose of this article is mostly to record the method of plan shaping before it is lost to time and to a lesser extent, argue for it's role. A more detail explanation for the role of plan shaping may follow in a future article.

Plan Shapes

In Microsoft?Compatibility certification:?Query plan shape refers to the visual representation of the various operators that make up a query plan. This includes operators like seeks, scans, joins, and sorts, as well as the connections between them that indicate the flow of data and the order of the operations that must be executed to produce the intended result set.

Here, we are concerned with the shape or sequence of joins. There are other operators that appear by necessity of the query logic. The emphasis here is on the shape aspects of joins only, not the order of tables or join type.

In a join between two tables, A and B having one join condition, there is only one shape as shown below.

No alt text provided for this image
No alt text provided for this image

There are two possible table orders for the join: A as the outer input (top right) and B as the inner input (lower center), or B and A switched. There are three?Join?types: Nested Loops, Hash and Merge, excluding Adaptive join (join type determined at run time). In either case, the shape of the plan is the same.

Note, the join type may affect the choice of index, or rather the indexes present may affect the choice of join type. A bookmark lookup operation may also be present to access necessary columns not in the index. Here, the discussion is almost entirely on the plan shape for joins only and not additional necessary operations like the bookmark lookup.

Three Tables

Now look at a join between three tables. There are two possible plan shapes. (is this because there are two join conditions between the three tables?) One is for the joins as written below.

No alt text provided for this image

Note the actual query for the accompanying plans have?OPTION?(FORCE ORDER) to disable the query optimizer from rearranging the join sequence. This allows us to show the plan for the query as written. Interpreting the SQL literally (without optimization), this means Table A as outer input joins to B as inner input, and then as outer input, joins to C as inner input.

No alt text provided for this image

The other plan shape in joining three tables is below.

No alt text provided for this image

Again, interpreting the SQL literally without optimization: Table A as outer input joins the join of B as outer and C inner input, with the B+C join as the inner input in the join to A. The plan is below.?Itzik Ben-Gan?calls this a bushy join in one of his articles.

No alt text provided for this image

The following might be clearer with the parenthesis, which are unnecessary.

No alt text provided for this image

For each of the plan shapes, there are six possible permutations of table order (3 factorial), although this is not the topic here.

Four Tables

In joining four tables, there are three join conditions. I can come up with five plan shapes. If someone is aware of the mathematical formula for the number of plan shapes, please share.

No alt text provided for this image

As written and interpreted literally, the plan shape below shows A as outer input joins B as inner. This acts as the outer input in joining to C as inner input. Finally, this is the outer input in the join to D as inner.

No alt text provided for this image

The second join sequence below is for A as outer joins to inner input being the join of B to C. This then is the outer input joining to D as inner input.

No alt text provided for this image

The plan for the second sequence is:

No alt text provided for this image

The third join sequence is A as outer input. The inner input is: B as outer join to C as inner, then this being the outer in joining to D as inner.

No alt text provided for this image

The plan for the third sequence is:

No alt text provided for this image

The fourth join sequence is A as outer joins B as inner, then acting as outer joins with the inner input as C join D.

No alt text provided for this image

The plan for the fourth sequence is:

No alt text provided for this image

The fifth join sequence is below.

No alt text provided for this image

The plan for the fifth sequence is:

No alt text provided for this image

Notice that in the second and third join sequences, the first join condition is pushed down one and two places respectively. In the fourth join sequence, the second join condition is pushed down.

In the fifth sequence, the first and second join conditions are pushed down. This pattern should help the mathematically inclined to produce a formula. But this pattern may not represent all possibilities?

In practice, only certain join orders or plan shape may make sense because of the nature of the join conditions? This is excluding index considerations.

Usage

Without the use of hints, the query optimizer attempts to find a good plan within a reasonable time and disregards the join order as written. See?Understanding Optimizer Timeout and how Complex queries can be Affected in SQL Server, Joseph Pilov. To force a specific join order, a?query hint?of some sort must be used. A subset of query hints are?Join Hints.

First, a brief digression. Certain hints are not really hints, but directives. When used, a number of query optimizations maybe disabled. If we use a hint, be prepared to handle query optimization explicitly. Below are the warnings and advice from Microsoft documentation on hints.

The hints override any execution plan the query optimizer might select for a query.

Caution?Because the SQL Server query optimizer typically selects the best execution plan for a query, we recommend that <join_hint>, <query_hint>, and <table_hint> be used only as a last resort by experienced developers and database administrators.

That said, the two methods for forcing join order I have used are:?OPTION?(FORCE ORDER) or a join hint. Example: A INNER LOOP JOIN B. Option Force Order is just that. The join order is as written. The join types are up to the query optimizer.

A join hint, LOOP, HASH or MERGE, forces the join type on the specified join as well as forces the join order for the entire query as written. Example:

No alt text provided for this image

Any join without specified join type is up to the optimizer. The important aspect of join hints is that join order is also forced. There does not exist a hint to use a specified join type between two tables, yet leave the join order up for optimization, even if only for the uninvolved tables.

Use Cases

Microsoft warns not to use hints except as a last resort, and even then only with additional qualifiers. In other words, if a hint is used, Microsoft does not accept responsibility for any consequences.

If a hint is used in a production system, there probably needs to be a plan to handle the situation should something goes wrong. Is a qualified DBA monitoring the system? Can a procedure be changed to remove the hint if a problem is detected? What level of review and escalation is necessary to make the change? Presumably, the hint was used to avoid some problem in the first place. If there is a problem from the hint, removing the hint may move the new problem back to the original problem?

It is easier to justify the practice of plan shaping (with hints) in a development environment. One example is as an investigative tool. Consider some query. The query optimizer produces a certain plan shape and join types.

Suppose we believe the plan shape to be correct but would like to change a join type. The approach is to first write the query using methods described above to produce the desired plan shape. Verify this with the Display Estimated Execution Plan feature in SSMS. The join types should be the same as the unhinted execution plan. Verify that the actual execution time and CPU is about the same as the original. Now, specify the join type at the location in question.

Summary

The method of explicit or manual plan shaping is not difficult. It is simply a forgotten technique in the world of highly capable query optimization. It is even a discouraged technique given Microsoft's preference for intelligently fixing problematic plans behind the scenes without code change. While (IQP) is great feature to have, it should not be used an excuse to encourage sloppy query writing. There will be problems which even a very powerful behind the scenes intelligence cannot solve. We certainly do not want to incur the overhead of adaptive intervention in high frequency operations.

Addendum

I have just been alerted by Randall G that the number of plan shapes can be represented by Catalan numbers, with n being the number of joins. The number of plan shapes for 2, 3, 4, 5 and 6 table joins are 1, 2, 5, 14 and 42. The Ultimate Question of Life, the Universe and Everything can be found in a six table join!

References

5 Table joins in Execution Plan Shapes 2

Intelligent Query Processing?(IQP), 2022 version.

Benjamin Nevarez?Inside the SQL Server Query Optimizer

Surajit Chaudhuri?Overview of Query Optimization

要查看或添加评论,请登录

Joe Chang的更多文章

  • US Population from Neilsberg vs SSA (Elon)

    US Population from Neilsberg vs SSA (Elon)

    update : data from census.gov, not sure if actual or estimated, additional bracket for 80-89, 90-99 and 100+.

    1 条评论
  • single to multi-socket scaling

    single to multi-socket scaling

    Lenovo and AMD recently published TPC-E benchmark result for the 2-socket EPYC 9554. Most recent AMD TPC-E results have…

    5 条评论
  • SQL Server Performance from Intel Comet Lake to Raptor Lake

    SQL Server Performance from Intel Comet Lake to Raptor Lake

    A year ago, I reported on the performance characteristics of basic SQL Server operations (L2 Cache Size…

  • Insert, Update and Delete Tricks

    Insert, Update and Delete Tricks

    The previous articles Insert, Update and Delete Plan Cost, Asynchronous IO and Storage Arrays and IUD Performance…

    1 条评论
  • Filtered Statistics Tricks in SQL Server

    Filtered Statistics Tricks in SQL Server

    Data distribution statistics is one the foundational elements of cost-based query optimization in modern relational…

    2 条评论
  • L2 Cache Size & SQL Performance

    L2 Cache Size & SQL Performance

    L2 Cache Size Impact on SQL Server Performance Edit 2023-04, Update for Raptor Lake L2 Cache Size Impact on SQL Server…

  • SQL Server Joins - 2 SARGs

    SQL Server Joins - 2 SARGs

    SQL Server Join Costs, 2 SARG The previous Join costs covered join with a search argument on one source only at DOP 1…

    2 条评论
  • SQL Server Parallel Plan Cost

    SQL Server Parallel Plan Cost

    SQL Server Parallel Join Costs Up until the early 2000's, microprocessor vendors focused on improving the performance…

  • SQL Server Join Costs

    SQL Server Join Costs

    SQL Server Join Costs, 1 SARG Here we look at Loop, Hash and Merge joins with an equality value search argument (SARG)…

    2 条评论
  • SQL Server Key Lookup & Scan

    SQL Server Key Lookup & Scan

    SQL Server Key Lookup & Scan, Plan vs. Actual Costs Previously in Plan Cost Cross-Over, we looked at the SQL Server…

社区洞察

其他会员也浏览了