Comparing To Lists
Roland Schubert
Entscheidungs- und ergebnisorienierte Datenanalyse für Finance, Marketing & HR - Von der Strategie bis zur Umsetzung | Planungs- und Analyseprozesse effizient gestalten, optimieren und automatisieren | Alteryx ACE
Long years ago, a DIY chain in Germany launched a discount campaign entitled "20% discount on all items - except pet food".??It might not have been such a good idea, but that's another story.
If they had Alteryx available, they might have analyzed more carefully in advance what impact the action could have on turnover and gross profit.?I assume they did not have this option.
However, we can simply calculate what our revenue would be if all prices except pet food were reduced by 20%. To do so,? we use a data set containing all products, prices and quantities sold. Thereby we assume that the quantities sold do not change as a result of the discounts (okay, that is very pessimistic).
Of course, this is done quickly. We use a Formula tool to calculate the new price on the condition that there is a 20% discount for all categories except pet food, and then calculate the revenue for the original and the discounted prices. The price calculation might look like this:
Alternatively, of course, it also works like this:
Or even like this:
Which of these options you use is a matter of personal taste - the second and third styles have slight advantages when it comes to performance, but I prefer the first to be a little clearer.
So the task is quickly done - or maybe not? After analyzing the potential effects of the discount campaign, it occurred to someone that an undifferentiated price reduction across almost all product groups is not such a good idea after all ... and as a result of this consideration we now get a list of selected product groups for which prices are to be adjusted.
The new task is perform the same calculation, but only reduce the prices for the following product groups by 20%:
All other prices will remain unchanged.
Option 1: IF/ELSEIF
?The above calculation can be easily extended by simply checking with IF/ELSEIF whether the respective items are part of one of the specified product groups and then reducing the price accordingly; if none of the conditions are met, the original price is used again. This then looks like this:
Of course, this can be done quickly with COPY/PASTE and a little adjustment. But now let's imagine that the creative colleague who plans the campaign comes up with the idea of changing the percentage. This has to be done in each line individually.
It is also possible that other product groups will be added - the formula will then be very long and confusing. It is already not very clear ...
领英推荐
Option 2: IN-Liste
But there are other options. We do not have to use a separate condition for each selection option, instead we can check whether there is a hit in a list of descriptions. For this purpose, there is the operator IN.
This looks like this:
All in all, it is much clearer. A positive side effect is that it is also faster, even significantly faster. In the example, we gain around 40 percent! So the modification was actually beneficial.
Option 3: REGEX
When it comes to determining whether a text appears in a list, we could of course think of RegEx as an option. And in fact, that is also possible here. The function REGEX_Match can be used to check the match:
If one of the entries from the list is found in the [CatDesc] column, the condition is fulfilled. Important: The single list entries must be separated with "|" in this case! This is also possible, but it takes longer - the time is about the same as with the IF/ELSEIF solution. "IN" would therefore be the better solution here.
A disadvantage remains, however, that the list is more or less "fixed" - with every change we have to adapt the formula, i.e. add the categories mentioned in the IN list (or the IF/ELSEIF or REGEX_Match). If we want to avoid this, we can also split the "identification of the affected categories" and the "calculation of the new price" into two steps.
Option 4: Split condition and calculation
In this approach, we use a Text Input Tool to maintain our list of categories (this could also be a file or table in combination with an Input Data Tool). We then use a Find Replace Tool to attach an indicator to the found categories and control the calculation using this indicator. So if the category appears in the "price list" in the category list, the Find Replace Tool simply appends the name again, the additional field is then (but only then) not NULL and the discount is calculated. Otherwise the normal price is kept again.?
The performance is a little worse compared to the IN solution, but it is still much better than the other approaches! In addition, we gain flexibility this way. And that's definitely worth it.
Option 5: Add more flexibility to calculation
But it can be even more flexible. We can also put the discount in the Text Input tool (or the corresponding input file) - then we can not only control which product groups are reduced in price, but also how high the discount should be.
Using the Find Replace tool, we no longer attach the product group name as an indicator, but the discount. And this is then used (if available) to calculate the new price that results when we multiply the old price by "1 - discount".
This leaves us with the exciting question of what the performance is like. And here we experience a surprise in the example - the solution just shown is not only the most flexible, but also the fastest!??
As we see again, there is (as very often) a whole range of possible solutions - and in addition to the options presented here, there are of course other approaches. Depending on the specific task, you have the choice - and it by no means always has to fall on option 5, even if it was the most flexible and fastest approach in our example. If we can assume that the selection list will be stable, the solution with an IN list certainly has the advantage that it is easy to understand and follow and could then be the "right" decision.