Big data, big noise

We tend to believe that more data means more value. In some situations this is true, but in others it is exactly the other way around. The problem is that we rarely take the latter situation into account. As a result, we identify patterns that aren’t real, which, as this piece explores, is dangerous.


Confusing noise for signal

I think most marketers nowadays would consider themselves ‘fact-based’ marketers: facts, rather than intuition, guide their decision making. But all too often, even though they may be measuring all sorts of things, their assumptions about how things work hamper smart decision making. One of the errors I find particularly fascinating is the tendency to ignore the influence of chance. Although it is a very well-documented error, even trained intelligence professionals tend to mistake noise (chance, randomness) for signal (logic, explanation).

Take for example A/B testing. Two alternative ways of presenting something to a consumer are tested in a real (usually online) environment. The alternative that generates the most clicks, or has the highest conversion rate, wins. Most tools that facilitate A/B testing will flag a winner as soon as the test yields a ‘statistically significant’ result.

Many winners flagged with this approach, however, will not give you any uplift in clicks or conversion, simply because most tools have simplified the criteria for a proper test. Declaring a winner the moment significance appears (so-called optional stopping) inflates the false-positive rate, which is exactly why so many flagged ‘winners’ are noise. A proper test means 1) setting a minimal effect size in advance; 2) determining a sample size that gives the test enough power to detect that minimal effect with a chosen probability; and 3) letting the experiment run until it is finished. For many, this is too much trouble, or worse, they are not aware that these are the minimum requirements for a valid test.
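To make step 2 concrete, here is a minimal Python sketch using statsmodels. The baseline conversion rate and the minimal uplift are illustrative assumptions, not figures from the article:

```python
# A minimal sketch of sizing an A/B test before it starts.
# The baseline rate and minimal uplift below are illustrative assumptions.
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

baseline = 0.04          # current conversion rate (assumption)
minimal_uplift = 0.01    # smallest effect worth detecting, set in advance
alpha = 0.05             # accepted false-positive rate
power = 0.80             # probability of detecting the minimal effect

# Cohen's h: standardized effect size for comparing two proportions
h = proportion_effectsize(baseline + minimal_uplift, baseline)

# Required visitors per variant; run the full sample before judging
n_per_variant = NormalIndPower().solve_power(
    effect_size=h, alpha=alpha, power=power, ratio=1.0
)
print(f"Visitors needed per variant: {n_per_variant:,.0f}")
```

With these assumptions the required sample runs into the tens of thousands per variant, which is precisely why stopping early on the first significant dip or spike is so tempting, and so misleading.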


Looking for confirmation

The A/B test example illustrates a serious problem in marketing intelligence: the criteria we use are often simplified or flawed. The solution in this case is straightforward: set up the experiment as prescribed in any solid textbook on testing and experimentation. Just as importantly, don’t let an automated tool make decisions for you without a firm understanding of its underlying assumptions and criteria.

Even if you are aware of all the assumptions and criteria, your mind can still play tricks on you. The other day I had a conversation with a skilled and experienced analyst who was comparing the conversion rates of two groups. One group got a specific treatment that should raise conversion, the other one was a control group. The analysis showed no difference between the two.

The analyst’s first reaction was a natural one: ‘This can’t be right… It surely must work… I must have overlooked something…’ So he turned the data upside down, tweaking it this way and that in an attempt to get the expected result. In this case, the data didn’t confess, and the analyst felt he had lost the battle.

I have seen many versions of the example above. It illustrates another persistent issue: our expectations affect what we see, and how we see it. If the data doesn’t live up to our expectations, we usually don’t give in easily; we start tweaking until the data once again fits our view of how the world works. ‘Motivated reasoning’ is a well-documented phenomenon that tricks our minds far more than we realize at first sight. But when we are really honest with ourselves, I think we can all identify with the CMO of a Dutch corporate who recently acknowledged to me that ‘we are a data-driven company, as long as the data confirms what we believe’.


Blowing up variation

Finally, our assumptions about how the world works can lead us astray. For example, as marketers and analysts are often focused on finding variation in their numbers, they use various tools that do exactly that. Indexing, significance testing, brand mapping and segmentation algorithms (just to name a few) tend to exaggerate the variation that’s hidden in the numbers. Hence, it doesn’t take much effort to see a bit of signal in a sea of noise.

Take brand perceptions, for example. If you measure how various brands in the category are perceived, it often looks like a lot is going on. An example is given in Exhibit 1. As you can see, Brand 3 scores highest on ‘Always lowest price’. This is not (relative) differentiation, however. To see the real differentiation, we need to acknowledge the patterns that are always present in this type of data.


Exhibit 1: Raw brand scores (% agree & totally agree)


First, bigger brands get higher scores on all attributes, simply because they are bought more often, and even for non-customers bigger brands are more visible. Second, some attributes are more associated with the category than others, and therefore get higher scores. Consumers will probably associate their favourite brand of yoghurt with ‘good quality’ or ‘value for money’, but they will be unlikely to think of this product as ‘innovative’ or ‘unique’.

It is important to take these patterns into account when we try to make sense of brand perception data. Exhibit 2 shows you how to do this. The first table shows the same numbers as in Exhibit 1, but they are now sorted horizontally (so bigger brands go to the left) and vertically (so higher attribute scores go up). The heat map helps to see the patterns: bigger brands have higher numbers for all attributes (brand effect) and some attributes have higher numbers for all brands (attribute effect).
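This sorting step is easy to automate. Here is a small Python sketch, with made-up numbers and brand/attribute names rather than the exhibit’s actual data, that orders columns by brand size and rows by attribute strength:

```python
# A small sketch of the sorting step from Exhibit 2.
# Scores, brand names and attribute names are illustrative, not the exhibit's.
import pandas as pd

scores = pd.DataFrame(
    {"Brand 1": [44, 20, 56], "Brand 2": [36, 16, 48], "Brand 3": [28, 12, 40]},
    index=["Value for money", "Innovative", "Good quality"],
)

# Bigger brands go to the left, category-defining attributes go to the top,
# so the brand effect and the attribute effect become visible at a glance.
sorted_scores = scores.loc[
    scores.mean(axis=1).sort_values(ascending=False).index,
    scores.mean(axis=0).sort_values(ascending=False).index,
]
print(sorted_scores)
```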


Exhibit 2: Raw, expected & relative scores



Now here’s the trick: by using the row and column averages, you can take both the brand and the attribute effect into account. You can calculate the scores you would expect to see if there were only differences in brand size and attribute prototypicality. For Brand 3, for example, we would expect its value for ‘Preserves colours’ to be (32 x 33) / 21 ≈ 50, i.e. the row average times the column average, divided by the grand mean. So Brand 3 scores 6 percentage points higher on this attribute than we would expect given its relative size and given the relevance of this attribute.
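The same calculation takes only a few lines of Python. Again, the numbers below are illustrative, since the exhibit’s underlying data isn’t reproduced here:

```python
# A minimal sketch of the expected-score calculation.
# Rows are attributes, columns are brands, as in Exhibit 2;
# the numbers are illustrative, not taken from the exhibit.
import numpy as np

raw = np.array([
    [56.0, 48.0, 40.0],
    [44.0, 36.0, 28.0],
    [20.0, 16.0, 12.0],
])

row_avg = raw.mean(axis=1, keepdims=True)   # attribute effect (per row)
col_avg = raw.mean(axis=0, keepdims=True)   # brand-size effect (per column)
grand_mean = raw.mean()

# Score you would expect if only brand size and attribute
# prototypicality were at play
expected = row_avg * col_avg / grand_mean

# Relative score: what remains after removing both structural effects
relative = raw - expected
print(np.round(relative, 1))
```

Whatever survives in the relative scores is the candidate differentiation; everything removed was just the two structural patterns dressed up as insight.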

As you can see in the table on the far right, the level of differentiation in this case is modest at best: by far most relative scores are within plus or minus 5 percentage points. Only 4 out of 75 brand x attribute combinations show differences larger than this. Whether those are meaningful is up for debate. The point is, again, that when we dig in a bit, most of the variation we see is noise rather than signal.


Noise is a distraction

Long story short, as the amount of data and the number of data sources at our disposal increase, it is tempting to believe that our decisions will become better over time. But as the availability of data grows, so does the level of noise. If we don’t get better at separating the wheat from the chaff, the value of the ‘new oil’ will prove quite disappointing.


Note: original source: Eat Your Greens - Fact-based thinking to improve your brand's health, curated by Wiemer Snijders, published by APG, 2018. This is an edited and shortened version.



Ariën Breunis

Brand and Marketing Strategy | Training | Research | Head of | Interim | Freelance

1y

Nice one Robert van Ossenbruggen. I’m personally quite fond of this analysis. One has to realize, though, that a lot of what can be found is the result of what you put in there in the first place, i.e. the brands and the statements. Conversely, you won’t find anything on what you didn’t put in. When would you say these analyses are most helpful?

Bert Moore

Brand nurturer. Imaginative problem solver. Data juggler. Tech humanist. Gritty, sleeves-rolled-up entrepreneur.

1y

Thank you Robert; for the somewhat protracted debate, I’d suggest your other article in Greens, ‘Facts, frames and fantasies’, is a useful reminder of how we can get our feet stuck in mental mud.

Anna O'Riordan

Providing more than a fraction of Marketing support | CMO | Marketing Director | Brand & Marketing Consultant | Co-founder esTeam

1y

Food for thought indeed - thanks for sharing Robert

