Advanced clustering for marketing with Tableau

Last week, one of my customers asked me how they could cluster their customers, accounts or contacts based on insights from their data for marketing purposes such as sending out promotions, inviting people to events or recommending more products. They wanted to address the right customers, accounts or contacts with the right message, but they also wanted to keep it simple and base this task on data analyzed in Tableau.

Tableau's Integrated Feature

As a long-time solution consultant, my first answer was that there are big software suites that solve this problem in quite sophisticated ways, for example Salesforce Marketing Cloud, and that Tableau can work together with most of these solutions in some way. For a first approach, however, this seemed to be too much for this customer.

As I like to keep things simple, I pointed this customer to Tableau's integrated feature to cluster data. A simple drag and drop with your mouse and you get your data clustered without any effort, but does this really help me here?



Limitations of Tableau's Clustering

I am using the Superstore dataset here and yes, I get my four clusters for customers based on sales, discount and profit, but I'm quite limited in what I can do with the insight that I get here. It's very visual, but I cannot download the data with the clusters assigned and I have no influence on the actual algorithm involved. Also, if I want to know how the algorithm arrives at the clusters, I am basically stuck with some statistical information that Tableau gives me. From a business standpoint, this is of little use.

Examples for Real-World Usage of Clustering

What I really need is a proper way to export the customer IDs together with their assigned cluster, or to have defined limits for the different factors going into the clustering (sales, discount, profit) so I can use intervals in my marketing system to filter for the right group of customers to address. Unfortunately, this does not come with Tableau, which is understandable: it's a general-purpose analytics solution and not a marketing suite.

But here's the thing: I can actually bring in a self-defined clustering algorithm via the Tableau analytics extension, with Python as the language. This way, I can use the latest developments, adjust to the given situation and improve the solution over time.

To keep things consistent, I stick with what Tableau uses internally and select the K-Means algorithm to cluster the Superstore customers. As I am not a daily Python coder, I used ChatGPT to create the code for the Tableau calculation.

OK, I am using the same algorithm as before, but the value of this approach is that I get the cluster as a real data point, so I can export the complete table of customer IDs including the assigned cluster. As a side effect, I can easily create a parameter with a slider to play around with the number of clusters.
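
For readers who want to see what such a calculation boils down to, here is a minimal, dependency-free sketch of K-Means in Python. It is not the code used in the workbook: the customer rows, the function names and the deterministic farthest-first initialization are assumptions made for illustration, and in practice a library such as scikit-learn would typically do this work. The point is that the cluster comes back as a plain column next to the customer ID, which can then be exported.

```python
import csv
import io
from statistics import mean, pstdev


def standardize(column):
    """Scale one feature to zero mean and unit variance (constant columns become 0)."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma if sigma else 0.0 for v in column]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns one cluster index per point."""
    # Assumption: farthest-first initialization keeps the sketch deterministic.
    # Libraries usually use a randomized scheme such as k-means++ instead.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # converged, nothing moved
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ran empty
                centers[c] = tuple(mean(dim) for dim in zip(*members))
    return labels


# Hypothetical rows: (customer_id, sales, average discount, profit).
customers = [
    ("C-001", 100.0, 0.30, 5.0),
    ("C-002", 120.0, 0.25, 8.0),
    ("C-003", 110.0, 0.28, 6.0),
    ("C-004", 900.0, 0.05, 200.0),
    ("C-005", 950.0, 0.02, 220.0),
    ("C-006", 1000.0, 0.04, 210.0),
]

columns = list(zip(*customers))
ids = columns[0]
points = list(zip(*(standardize(col) for col in columns[1:])))
labels = kmeans(points, k=2)  # k plays the role of the slider parameter

# Write customer IDs and clusters as CSV text, ready to hand over elsewhere.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["customer_id", "cluster"])
writer.writerows(zip(ids, labels))
export_csv = buffer.getvalue()
```

Changing `k` is all it takes to mimic the slider. Inside a Tableau script calculation the feature columns would arrive as lists rather than as the hard-coded rows shown here.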

Exploring Other Clustering Algorithms

Now I can change the code to another clustering algorithm like DBSCAN (Density-Based Spatial Clustering of Applications with Noise). This algorithm does not take a predefined number of clusters; instead, you can play around with its parameters and immediately see the visual result.
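
To make those parameters concrete, the following is a small sketch of DBSCAN in plain Python. The names `eps` (neighborhood radius) and `min_samples` (points needed to form a dense region, counting the point itself) mirror the parameter names used by scikit-learn; the sample points are made up, and a real workbook would more likely call a library than use this hand-written version.

```python
import math


def dbscan(points, eps, min_samples):
    """Return one label per point; -1 marks noise, clusters are numbered from 0."""

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned or marked as noise
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may still become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


# Hypothetical, already scaled (sales, profit) pairs: two dense groups and one outlier.
points = [
    (1.0, 1.0), (1.2, 1.1), (0.9, 1.0),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
    (9.0, 0.0),
]
labels = dbscan(points, eps=0.5, min_samples=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

The number of clusters is a result here, not an input: widening `eps` merges groups, raising `min_samples` turns sparse groups into noise.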


For the interested reader, I found this description of the difference between K-Means and DBSCAN quite readable:

"K-Means requires you to specify the number of groups in advance and works best when the groups are roughly spherical and of similar size. It's like sorting marbles into a set number of jars based on their color. On the other hand, DBSCAN doesn't need you to specify the number of groups and can find clusters of any shape, even identifying outliers that don't fit into any group. It's more like finding natural clusters of stars in the sky, where some stars might be isolated and not part of any constellation." But of course, this is not our point here.

How About Intervals For My Marketing Challenge?

We still have this issue: we cannot define any intervals to filter the customers and maybe re-use these filters later. You might say "Why re-use the same filters anywhere else when I have a list of customer IDs and clusters?" and yes, I see the point. But my experience tells me that people want to understand what's going on, and numerical intervals make it clear that they are addressing e.g. the customers with high sales, medium average discounts and low profit, because targeting them might increase overall profit while risking a bit of unprofitable revenue.

Let's quickly play around with other algorithms - AI is our friend, as we can simply ask for other algorithms and get the code immediately. 'Fuzzy C-means' does not force a customer into a single cluster, but instead expresses the degree to which a customer belongs to each group. You can see the result best when changing the scatterplot a tiny bit.
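
As an illustration of what "degree of belonging" means, here is a sketch of the membership step of Fuzzy C-means for a single customer, assuming the cluster centers are already known. It covers only this one step, not the full iterative algorithm; the function name, the centers and the fuzzifier value `m = 2` are assumptions chosen for the example.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Membership of one point in each cluster for fixed centers (fuzzifier m > 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point sitting exactly on a center belongs fully to that cluster.
    if 0.0 in dists:
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    # Standard update: u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1))
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


# Hypothetical centers and one customer, in scaled (sales, profit) space.
centers = [(0.0, 0.0), (3.0, 0.0)]
memberships = fuzzy_memberships((1.0, 0.0), centers)
print([round(u, 3) for u in memberships])  # [0.8, 0.2]
```

The memberships always add up to 1, so they can be used directly as a color gradient or size in the scatterplot instead of a hard cluster color.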

Gaussian Mixture Models can identify clusters that come in various shapes and sizes, and can even handle clusters that are tilted or stretched in different directions, by fitting a bell curve to each group. This means that customers are assigned to clusters based on probabilities, giving a more flexible and accurate grouping that reflects the natural variations in the data.
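
The probability-based assignment can be sketched as follows. This shows only how probabilities are computed once a model has been fitted; the component weights, means and variances are invented, and the sketch assumes axis-aligned (diagonal) bell curves, so it does not cover the tilted clusters a full-covariance model can represent.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


def responsibilities(point, components):
    """Probability that the point belongs to each component (the E-step of EM)."""
    weighted = []
    for comp in components:
        # Diagonal covariance: the joint density is the product over features.
        density = math.prod(
            gaussian_pdf(x, mu, var)
            for x, mu, var in zip(point, comp["means"], comp["variances"])
        )
        weighted.append(comp["weight"] * density)
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical, already fitted components over (scaled sales, scaled profit).
components = [
    {"weight": 0.5, "means": (0.0, 0.0), "variances": (1.0, 1.0)},
    {"weight": 0.5, "means": (4.0, 0.0), "variances": (1.0, 1.0)},
]

probs = responsibilities((1.0, 0.0), components)
# The cluster with the highest probability for this customer.
dominant_cluster = max(range(len(probs)), key=probs.__getitem__)
```

A customer halfway between the two centers would get a probability of 0.5 for each cluster, which is exactly the kind of borderline case a hard assignment hides.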

This algorithm can also be combined with the desired interval bins for sales, profit and discount. I extended this approach a bit and used it in a script node in Tableau Prep Builder, but the result is generally similar to Tableau Desktop: I am using a Python script to assign customers to clusters, but now I am splitting up the numerical ranges for sales, discount and profit into easy-to-read buckets like 'low', 'medium', 'high' and 'very high'. The advantage is that behind these labels, I can find exact numerical intervals that could define filter criteria for the list of customers we want to address. The GMM algorithm defines a 'Dominant Cluster', but also defines probabilities for each customer with respect to the other clusters. This way, the result is far more flexible compared to the approaches before. (Please just ping me if you are interested in getting my code sample and the Tableau Prep flow.)

I am using Tableau to inspect the resulting cluster assignments and found this visualization quite useful - this might be a very good starting point for what I really wanted to find.
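
One possible way to derive such buckets is sketched below. How the buckets were actually cut in the Prep flow is not stated here, so the quartile-based cut points, the function names and the sales figures are assumptions; the idea is only to show that each label maps back to an exact numerical interval.

```python
from bisect import bisect_right
from statistics import quantiles

LABELS = ["low", "medium", "high", "very high"]


def bucket_edges(values):
    """Three inner cut points (quartiles) that split the values into four buckets."""
    return quantiles(values, n=len(LABELS), method="inclusive")


def bucket_label(value, edges):
    """Map a value to its label; a value equal to a cut point goes to the upper bucket."""
    return LABELS[bisect_right(edges, value)]


def bucket_intervals(values, edges):
    """Numeric (lower, upper) bounds behind each label, usable as filter criteria."""
    bounds = [min(values), *edges, max(values)]
    return {label: (bounds[i], bounds[i + 1]) for i, label in enumerate(LABELS)}


# Hypothetical sales totals per customer.
sales = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0, 900.0]
edges = bucket_edges(sales)
sales_buckets = [bucket_label(v, edges) for v in sales]
intervals = bucket_intervals(sales, edges)
print(intervals["low"])  # (100.0, 300.0)
```

The same three functions can be applied to discount and profit, giving one label column per measure plus a small lookup table of intervals to reuse as filters in another system.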

This viz is a kind of 'buffet' where I can select the customers in the different cells of the matrix and then hand over to any marketing solution.

For example, you can easily see that customers with 'low' sales and 'low' to 'medium' profit make up the largest part of the orange cluster (number 2). Finding the necessary numerical ranges for sales and profit is easy and can be done with a box-and-whisker plot in Tableau.

To wrap up the idea of using the clustered customers in some external application, I want to point to some approaches that Tableau allows:

  • Tableau allows calling so-called flows in Salesforce easily - within the flows, all kinds of things might happen to the data. The idea would be to select one or more customer marks in Tableau and trigger the flow, handing over the selected customer IDs and e.g. sending an email directly, creating a service case or a task for a sales rep. This capability, named 'Tableau External Actions', is not used very often, and I tried it myself.
  • If you are working with Salesforce Data Cloud, you might directly create the segment from your Tableau viz using this approach: https://help.tableau.com/current/pro/desktop/en-us/segments.htm, which provides the possibility to easily re-use the segment within a single data lakehouse which is open to a number of systems, can be analyzed directly with Tableau and of course has a number of advantages combined with the rest of the Salesforce universe.
  • If your target system offers something like a REST API, you might use Tableau's worksheet or dashboard actions to call a parameterized URL.
  • If there is nothing else, you can right-click on the selected customer marks, export the associated data to a CSV file and use this as a bridge to external systems. Of course this is a bit manual, but still a real-world solution.
