Excel LAMBDA Spotlight: UNPIVOT

Excel LAMBDA Spotlight: UNPIVOT

CREDITS

In this article I'll look at a LAMBDA function called UNPIVOT. This one was written by?Erik Oehm, creator of the amazing Excel Robot.

This article is a continuation of last week's LAMBDA spotlight, which covered the CROSSJOIN function. UNPIVOT uses CROSSJOIN, so if you want a full understanding of both, you can read the CROSSJOIN article first.


OVERVIEW

In Erik's words, UNPIVOT, "given a table range with headers and array of header names, unpivots the specified columns in place, optionally removing any blank entries."

In the made-up example below, I've used UNPIVOT in cell I2 to select a cross-tabulated table of monthly sales by product and category. I've the whole table to the first argument (B2:G12), and the headers of the columns I want to unpivot to the second argument (D2:G2).

No alt text provided for this image

Both arguments can also be a Table, a named range or an array, whether produced by another function, or a typed array.

Hopefully you can instantly see the utility of UNPIVOT. Let's unpack how it works.


FUNCTION DEFINITION

/
Name: Unpivot Table (UNPIVOT)


Description: Given a table range with headers and array of header names, unpivots the 
? ?specified columns in place, optionally removing any blank entries.


Written By: Erik Oehm / Excel Robot (@ExcelRobot)


Category: Array
*/
UNPIVOT = LAMBDA(table, columns_to_unpivot, [attribute_name], [value_name], [remove_blanks],
? ? LET(
? ? ? ? _AttributeLabel, IF(ISOMITTED(attribute_name), "Attribute", attribute_name),
? ? ? ? _ValueLabel, IF(ISOMITTED(value_name), "Value", value_name),
? ? ? ? _FirstColumnToUnpivot, MATCH(INDEX(columns_to_unpivot, , 1), INDEX(table, 1, ), 0),
? ? ? ? _UnpivotColumnCount, COLUMNS(columns_to_unpivot),
? ? ? ? _ColumnNumbers, SEQUENCE(1, COLUMNS(table)),
? ? ? ? _IncludeColumns, (_ColumnNumbers >= _FirstColumnToUnpivot) *
? ? ? ? ? ? (_ColumnNumbers < _FirstColumnToUnpivot + _UnpivotColumnCount),
? ? ? ? _UnpivotColumns, FILTER(_ColumnNumbers, _IncludeColumns),
? ? ? ? _OtherColumns, FILTER(_ColumnNumbers, NOT(_IncludeColumns)),
? ? ? ? _FullOuterJoin, CROSSJOIN(
? ? ? ? ? ? CHOOSECOLS(table, _OtherColumns),
? ? ? ? ? ? VSTACK(_AttributeLabel, TRANSPOSE(columns_to_unpivot)),
? ? ? ? ? ? TRUE
? ? ? ? ),
? ? ? ? _WithValues, HSTACK(
? ? ? ? ? ? _FullOuterJoin,
? ? ? ? ? ? VSTACK(_ValueLabel, TOCOL(DROP(CHOOSECOLS(table, _UnpivotColumns), 1)))
? ? ? ? ),
? ? ? ? _RemoveBlanks, IF(
? ? ? ? ? ? OR(ISOMITTED(remove_blanks), remove_blanks),
? ? ? ? ? ? FILTER(_WithValues, INDEX(_WithValues, , COLUMNS(_WithValues)) <> ""),
? ? ? ? ? ? IF(_WithValues = "", "", _WithValues)
? ? ? ? ),
? ? ? ? _ColumnOrder, LET(
? ? ? ? ? ? n, COLUMNS(_RemoveBlanks),
? ? ? ? ? ? s, SEQUENCE(1, n),
? ? ? ? ? ? IFS(
? ? ? ? ? ? ? ? s < _FirstColumnToUnpivot,
? ? ? ? ? ? ? ? s,
? ? ? ? ? ? ? ? s < _FirstColumnToUnpivot + 2,
? ? ? ? ? ? ? ? s + n - _FirstColumnToUnpivot - 1,
? ? ? ? ? ? ? ? TRUE,
? ? ? ? ? ? ? ? s - 2
? ? ? ? ? ? )
? ? ? ? ),
? ? ? ? _ReorderColumns, CHOOSECOLS(_RemoveBlanks, _ColumnOrder),
? ? ? ? _ReorderColumns
? ? )
);*        

UNPIVOT takes five parameters:

  1. table - a range, named range, Table or array to unpivot
  2. columns_to_unpivot - a range, named range, Table column or array of column headers in table
  3. [attribute_name] - OPTIONAL - the column name to give to the unpivoted column headers. The default value is "Attribute"
  4. [value_name] - OPTIONAL - the column name to give to the unpivoted values. The default value is "Value"
  5. [remove_blanks] - OPTIONAL - whether or not to remove rows with blank values in the output. A blank value in the output would happen if you unpivoted a table with an empty cell. The default value is TRUE


BREAKDOWN

The function starts by defining LET variables:

? ? LET(
? ? ? ? _AttributeLabel, IF(ISOMITTED(attribute_name), "Attribute", attribute_name),
? ? ? ? _ValueLabel, IF(ISOMITTED(value_name), "Value", value_name),        

_AttributeLabel and _ValueLabel use the ISOMITTED function to assign a default value in case either attribute_name or value_name are omitted.

This is such a common formula, I've added a feedback request to Microsoft to add an IFOMITTED function, which would shorten the definition of _AttributeLabel to this:

_AttributeLabel, IFOMITTED(attribute_name, "Attribute"),        

Please vote for the idea here.

Next:


_FirstColumnToUnpivot, MATCH(INDEX(columns_to_unpivot, , 1), INDEX(table, 1, ), 0)
_UnpivotColumnCount, COLUMNS(columns_to_unpivot),
_ColumnNumbers, SEQUENCE(1, COLUMNS(table)),
_IncludeColumns, (_ColumnNumbers >= _FirstColumnToUnpivot) *
  (_ColumnNumbers < _FirstColumnToUnpivot + _UnpivotColumnCount),        

_FirstColumnToUnpivot takes the first value in the columns_to_unpivot argument and searches for it in the first row of table, then returns the position of the column in the table. This is used to separate what can be considered the 'grouping' columns (which aren't unpivoted) and the 'value' columns (which are).

_UnpivotColumnCount is the count of the columns to unpivot. Saving this simple function call like this is an indication it will be used in several places later on.

Similarly, _ColumnNumbers creates a row of integers representing the relative column index of each column in the table. In the example above, this is {1,2,3,4,5,6}.

_IncludeColumns is an interesting calculation. To illustrate this calculation, consider the example where I only want to unpivot February and March:

No alt text provided for this image

You can see that January and April have not been unpivoted. In this example:

_ColumnNumbers = {1,2,3,4,5,6}
_FirstColumnToUnpivot = 4
_UnpivotColumnCount = 2
_ColumnNumbers >= _FirstColumnToUnpivot 
    = { 1>=4, 2>=4, 3>=4,4>=4,5>=4,6>=4} 
    = {FALSE,FALSE,FALSE,TRUE,TRUE,TRUE}
_ColumnNumbers < _FirstColumnToUnpivot + _UnpivotColumnCount
    = {1,2,3,4,5,6} < 4 + 2
    = { 1<6, 2<6, 3<6, 4<6, 5<6, 6<6 }
    = {TRUE,TRUE,TRUE,TRUE,TRUE,FALSE}
(_ColumnNumbers >= _FirstColumnToUnpivot) *
  (_ColumnNumbers < _FirstColumnToUnpivot + _UnpivotColumnCount)
    = {FALSE,FALSE,FALSE,TRUE,TRUE, TRUE} * 
      { TRUE, TRUE, TRUE,TRUE,TRUE,FALSE}
    = {FALSE,FALSE,FALSE,TRUE,TRUE,FALSE}
    = {0,0,0,1,1,0}        

I know it seems long-winded, but I hope it makes things crystal clear. The calculation is creating a boolean array where columns to be unpivoted are TRUE, and columns which won't be unpivoted are FALSE.

There is a limitation of this approach. This assumes that the columns to unpivot are contiguous in the table. If I try to unpivot only January and March, the function unpivots January and February:

No alt text provided for this image

I'd hesitate to call this a bug, as I've recall speaking to Erik about this earlier this year. Hence, it's a known limitation. Realistically, cross-tabulated data in Excel usually involves contiguous values in a domain across columns. It's not likely that I would only want to unpivot January and March, for example.

Nevertheless, Excel is famous for offering many ways to achieve the same goal. One other way we might consider for _IncludeColumns is:

? ? ? ? _IncludeColumns, BYCOL(
? ? ? ? ? ? TAKE(table, 1),
? ? ? ? ? ? LAMBDA(c, NOT(ISERROR(XMATCH(c, columns_to_unpivot))))
? ? ? ? ),        

That is to say: iterate through each cell in the first row of the table (i.e. the header row). For each cell value, try to match it with the list of columns_to_unpivot . If there's a match, NOT(ISERROR( returns TRUE and FALSE if there isn't a match.

In fact, this "check for an item in a list" function is pretty common, and we can create a curried "list.CONTAINS" LAMBDA which can be used as shown.

list.CONTAINS = LAMBDA(array, 
? ? LAMBDA(item, 
? ? ? ? NOT(ISERROR(XMATCH(item, array)))
? ? )
);

_IncludeColumns, BYCOL(
    TAKE(table, 1),
    list.CONTAINS(columns_to_unpivot)
),        

That's a variant of a function in my list module, which you're welcome to explore.

Getting back to UNPIVOT, next we have:

_UnpivotColumns, FILTER(_ColumnNumbers, _IncludeColumns)
_OtherColumns, FILTER(_ColumnNumbers, NOT(_IncludeColumns)),,        

These two variables are just the column indices of the columns to unpivot and those of the other columns.

So, in the original example unpivoting all four months,

_UnpivotColumns = FILTER(_ColumnNumbers, _IncludeColumns)
                = FILTER({1,2,3,4,5,6}, {0,0,1,1,1,1})
                = {3,4,5,6}
_OtherColumns   = FILTER(_ColumnNumbers, NOT(_IncludeColumns))
                = FILTER({1,2,3,4,5,6}, {1,1,0,0,0,0})
                = {1,2}        

Next is where we use the CROSSJOIN:

? ? ? ? _FullOuterJoin, CROSSJOIN(
? ? ? ? ? ? CHOOSECOLS(table, _OtherColumns),
? ? ? ? ? ? VSTACK(_AttributeLabel, TRANSPOSE(columns_to_unpivot)),
? ? ? ? ? ? TRUE
? ? ? ? ),        

I won't go into details (please read the article linked above if you'd like to understand how it works), but put simply: cross join the columns that won't be unpivoted with the vertical stack of the attribute label and the transpose of the columns to unpivot array (i.e. {"Attribute";"January";"February";"March";"April} ). The result of _FullOuterJoin looks like this:

No alt text provided for this image

Next we add the values to the cross-joined data:

? ? ? ? _WithValues, HSTACK(
? ? ? ? ? ? _FullOuterJoin,
? ? ? ? ? ? VSTACK(_ValueLabel, TOCOL(DROP(CHOOSECOLS(table, _UnpivotColumns), 1)))
? ? ? ? ),        

Take the result from _FullOuterJoin (shown above) and to its right, drop the first row from the columns to be unpivoted, convert the result into a single column, then finally stack the value label on the top. The result:

No alt text provided for this image

If the function call has supplied TRUE to the, or omitted the, remove_blanks argument, then some filtering happens:

? ? ? ? _RemoveBlanks, IF(
? ? ? ? ? ? OR(ISOMITTED(remove_blanks), remove_blanks),
? ? ? ? ? ? FILTER(_WithValues, INDEX(_WithValues, , COLUMNS(_WithValues)) <> ""),
? ? ? ? ? ? IF(_WithValues = "", "", _WithValues)
? ? ? ? ),        

In English: if either the remove_blanks argument is omitted, or remove_blanks is TRUE, then filter the _WithValues result (shown above), for those rows in the right-most column which are blank. Otherwise, just set _RemoveBlanks to _WithValues.

We're getting close to the end now.

? ? ? ? _ColumnOrder, LET(
? ? ? ? ? ? n, COLUMNS(_RemoveBlanks),
? ? ? ? ? ? s, SEQUENCE(1, n),
? ? ? ? ? ? IFS(
? ? ? ? ? ? ? ? s < _FirstColumnToUnpivot,
? ? ? ? ? ? ? ? s,
? ? ? ? ? ? ? ? s < _FirstColumnToUnpivot + 2,
? ? ? ? ? ? ? ? s + n - _FirstColumnToUnpivot - 1,
? ? ? ? ? ? ? ? TRUE,
? ? ? ? ? ? ? ? s - 2
? ? ? ? ? ? )
? ? ? ? ),
? ? ? ? _ReorderColumns, CHOOSECOLS(_RemoveBlanks, _ColumnOrder),
? ? ? ? _ReorderColumns
? ? )
);        

_ColumnOrder is creating an array of integers which will be used to produce the output array. Consider the example of unpivoting only January and February:

No alt text provided for this image

The non-unpivoted columns are first, followed by Attribute and Value. In terms of the original table, the column order is 1,2,5,6,3,4

The important thing to remember about using IFS with an array is that it will test each element of the array against a condition and pass them separately through each condition until it returns TRUE for that element.

So, because the first element of s is 1, it returns TRUE from condition1 and is not passed to condition2. Similarly, because element 3 of s is 3, it returns FALSE from condition1 and is then passed for testing against condition2, from which it returns TRUE.

The net result can be visualized with this image:

No alt text provided for this image

I hope this makes sense. I think this is the trickiest part of this function to explain.

After calculating the column orders, we use them to reorder the columns:

? ? ? ? _ReorderColumns, CHOOSECOLS(_RemoveBlanks, _ColumnOrder),
? ? ? ? _ReorderColumns
? ? )
);        

And the reordered columns are returned as the result of UNPIVOT.


That's it for now. I hope you enjoyed this article.

All credit for the function described in this article and in the CROSSJOIN article goes to?Erik Oehm.


If this article has been interesting, please leave a reaction, a comment, or preferably both, and share the article with your network.

If you have a suggestion for a function which you'd like put under the spotlight, please message me.

Thanks for reading!

when column header more than two or more, formula error

回复
Peter Bartholomew

Technical Fellow at NAFEMS

1 年

Impressive! Another proof that "He who thinks life is simple, hasn't understood the problem. My formula for unpivoting was UPλ(item, mask) = LET( ? ? itemArr, IF(mask, item, NA()), ? ? field, TOCOL(itemArr, 3, TRUE), ? ? field ) I call the module 3 times, once for the values and once for each of the categories and attributes which are broadcast over the mask. That left me everything to do in the calling Lambda, but it did allow me to filter by category, attribute or value fields as required. For example = LET( ??values,??DROP(table,,2), ??category, TAKE(table,,2), ??recordId, SEQUENCE(ROWS(table)), ??month,??DROP(table[#Headers],,2), ??mask,???(values > 50)*(TAKE(category,,1)="CategoryA"), ??UPValues, UPλ(values, mask), ??recdIdx,?UPλ(recordId, mask), ??mnthAttr, UPλ(month, mask), ??HSTACK(CHOOSEROWS(category, recdIdx), mnthAttr, UPValues) ?) returned the values from CategoryA that exceeded the 50.00 threshold. Not so much a utility as a knife and fork job!

Eduardo Amaral

Ph.D. in Economics

1 年

That's great, Owen! Super useful! I'm really looking forward to a PIVOT's spotlight.

Steven Nernberg, MBA

Accounting Consultant | Excel ?? | SQL ?? | Python ?? |

1 年

A lot easier in Power Query ??

要查看或添加评论,请登录

Owen Price的更多文章

  • Excel LAMBDA Spotlight: Bisected Map with BMAPλ

    Excel LAMBDA Spotlight: Bisected Map with BMAPλ

    In this article I'll look at a LAMBDA function called BMAPλ, which was written by Peter Bartholomew. You can grab the…

    8 条评论
  • Excel LAMBDA Spotlight: Sudoku Solver!

    Excel LAMBDA Spotlight: Sudoku Solver!

    In this article I'll look at a LAMBDA function called solver, which was recommended to me by Bhavya Gupta and written…

    16 条评论
  • Excel LAMBDA Spotlight: SumColumnsλ

    Excel LAMBDA Spotlight: SumColumnsλ

    In this article I'll look at a LAMBDA function called SumColumnsλ, written by Craig Hatmaker. Craig is very active in…

    4 条评论
  • Excel LAMBDA Spotlight: INFLECTIONPOINTS

    Excel LAMBDA Spotlight: INFLECTIONPOINTS

    In this article I'll look at a LAMBDA function called INFLECTIONPOINTS. This is a function I wrote last year to quickly…

    8 条评论
  • Excel LAMBDA Spotlight: SHOWCLOCK

    Excel LAMBDA Spotlight: SHOWCLOCK

    In this article I'll look at a LAMBDA function called SHOWCLOCK. I don't know the full name of the person who wrote…

    8 条评论
  • Excel LAMBDA Spotlight: CROSSJOIN

    Excel LAMBDA Spotlight: CROSSJOIN

    CREDITS The CROSSJOIN LAMBDA function is the product of Erik Oehm, creator of the amazing Excel Robot. If you haven't…

    10 条评论
  • LAMBDA spotlight: Text.DropSliceBetween

    LAMBDA spotlight: Text.DropSliceBetween

    CREDITS This article will look at a single LAMBDA function called Text.DropSliceBetween.

    7 条评论
  • LAMBDA spotlight: A.XMATCH.ROWS

    LAMBDA spotlight: A.XMATCH.ROWS

    CREDITS Today I'm going to try something a little different. This article will throw a spotlight on a function that I…

    19 条评论

社区洞察

其他会员也浏览了