Useful Analytic functions
#SQL_Server supports these analytic #functions :
*?CUME_DIST (Transact-SQL)
*?FIRST_VALUE (Transact-SQL)
*?LAG (Transact-SQL)
*?LAST_VALUE (Transact-SQL)
*?LEAD (Transact-SQL)
Analytic functions calculate an aggregate value based on a group of rows. Unlike aggregate functions, however, analytic functions can return multiple rows for each group. Use analytic functions to compute moving averages, running totals, percentages or top-N results within a group.
1.???CUME_DIST
For SQL Server, this function calculates the cumulative distribution of a value within a group of values. In other words, CUME_DIST calculates the relative position of a specified value in a group of values. Assuming ascending ordering, the CUME_DIST of a value in row r is defined as the number of rows with values less than or equal to that value in row r, divided by the number of rows evaluated in the partition or query result set.
Syntax:
CUME_DIST( ) ?OVER ( [ partition_by_clause ] order_by_clause )?
Arguments
OVER?(?[?partition_by_clause?]?order_by_clause)
The?partition_by_clause?divides the FROM clause result set into partitions, to which the function is applied. If the?partition_by_clause?argument isn't specified,?CUME_DIST?treats all query result set rows as a single group. The?order_by_clause?determines the logical order in which the operation occurs.?CUME_DIST?requires the?order_by_clause.?CUME_DIST?won't accept the <rows or range clause> of the OVER syntax.
Return types
float(53)
Remarks
CUME_DIST returns a range of values greater than 0 and less than or equal to 1. Tie values always evaluate to the same cumulative distribution value. CUME_DIST includes NULL values by default and treats these values as the lowest possible values. CUME_DIST is nondeterministic.
2.???FIRST_VALUE
Returns the first value in an ordered set of values.
syntax
FIRST_VALUE ( [scalar_expression ] )?[ IGNORE NULLS | RESPECT NULLS ]OVER ( [ partition_by_clause ] order_by_clause [ rows_range_clause ] )
Arguments
scalar_expression
The value to be returned.?scalar_expression?can be a column, subquery, or other arbitrary expression that results in a single value. Other analytic functions aren't permitted.
[ IGNORE NULLS | RESPECT NULLS ]
Applies to: SQL Server (starting with SQL Server 2022 (16.x)), Azure SQL Database, Azure SQL Managed Instance, Azure SQL Edge
IGNORE NULLS - Ignore null values in the dataset when computing the first value over a partition.
RESPECT NULLS - Respect null values in the dataset when computing first value over a partition.
OVER ( [?partition_by_clause?]?order_by_clause?[?rows_range_clause?] )
The?partition_by_clause?divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group.
The?order_by_clause?determines the logical order in which the operation is performed. The?order_by_clause?is required.
The?rows_range_clause?further limits the rows within the partition by specifying start and end points.
Return types
The same type as?scalar_expression.
Remarks
FIRST_VALUE?is nondeterministic.
3.???LAG?
Accesses data from a previous row in the same result set without the use of a self-join starting with SQL Server 2012 (11.x). LAG provides access to a row at a given physical offset that comes before the current row. Use this analytic function in a SELECT statement to compare values in the current row with values in a previous row.
Syntax:
LAG (scalar_expression [,offset] [,default]) OVER ( [ partition_by_clause ] order_by_clause )?
领英推荐
Arguments
scalar_expression
The value to be returned based on the specified offset. It is an expression of any type that returns a single (scalar) value.?scalar_expression?cannot be an analytic function.
Offset
The number of rows back from the current row from which to obtain a value. If not specified, the default is 1.?offset?can be a column, subquery, or other expression that evaluates to a positive integer or can be implicitly converted to?bigint.?offset?cannot be a negative value or an analytic function.
Default
The value to return when?offset?is beyond the scope of the partition. If a default value is not specified, NULL is returned.?default?can be a column, subquery, or other expression, but it cannot be an analytic function.?default?must be type-compatible with?scalar_expression.
OVER?(?[?partition_by_clause?]?order_by_clause)
partition_by_clause?divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group.?order_by_clause?determines the order of the data before the function is applied. If?partition_by_clause?is specified, it determines the order of the data in the partition. The?order_by_clause?is required.
Return Types
The data type of the specified?scalar_expression. NULL is returned if?scalar_expression?is nullable or?default?is set to NULL.
4.???LAST_VALUE
Returns the last value in an ordered set of values.
Syntax
LAST_VALUE ( [ scalar_expression ] )?[ IGNORE NULLS | RESPECT NULLS ]
???OVER ( [ partition_by_clause ] order_by_clause [ rows_range_clause ] )
Arguments
scalar_expression
Is the value to be returned.?scalar_expression?can be a column, subquery, or other expression that results in a single value. Other analytic functions aren't permitted.
[ IGNORE NULLS | RESPECT NULLS ]
Applies to: SQL Server (starting with SQL Server 2022 (16.x)), Azure SQL Database, Azure SQL Managed Instance, Azure SQL Edge
IGNORE NULLS - Ignore null values in the dataset when computing the last value over a partition.
RESPECT NULLS - Respect null values in the dataset when computing last value over a partition.
OVER ( [?partition_by_clause?]?order_by_clause?[?rows_range_clause?] )
The?partition_by_clause?divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group.
The?order_by_clause?determines the order of the data before the function is applied. The?order_by_clause?is required.
The?rows_range_clause?further limits the rows within the partition by specifying start and end points.
5.???LEAD
Accesses data from a subsequent row in the same result set without the use of a self-join starting with SQL Server 2012 (11.x). LEAD provides access to a row at a given physical offset that follows the current row. Use this analytic function in a SELECT statement to compare values in the current row with values in a following row.
Syntax
LEAD ( scalar_expression [ ,offset ] , [ default ] ) OVER ( [ partition_by_clause ] order_by_clause )
Arguments
scalar_expression
The value to be returned based on the specified offset. It is an expression of any type that returns a single (scalar) value.?scalar_expression?cannot be an analytic function.
Offset
The number of rows forward from the current row from which to obtain a value. If not specified, the default is 1.?offset?can be a column, subquery, or other expression that evaluates to a positive integer or can be implicitly converted to?bigint.?offset?cannot be a negative value or an analytic function.
Default
The value to return when?offset?is beyond the scope of the partition. If a default value is not specified, NULL is returned.?default?can be a column, subquery, or other expression, but it cannot be an analytic function.?default?must be type-compatible with?scalar_expression.
OVER?(?[?partition_by_clause?]?order_by_clause)
partition_by_clause?divides the result set produced by the FROM clause into partitions to which the function is applied. If not specified, the function treats all rows of the query result set as a single group.?order_by_clause?determines the order of the data before the function is applied. When?partition_by_clause?is specified, it determines the order of the data in each partition. The?order_by_clause?is required