Contradicting Data can be a Data Quality issue, but Always?
Alban Gérôme
Founder, SaaS Pimp and Automation Expert, Intercontinental Speaker. Not a Data Analyst, not a Web Analyst, not a Web Developer, not a Front-end Developer, not a Back-end Developer.
Last week, Jeremy Wyatt, a recruiter who placed me in a job long ago, shared the story of an FTSE 100 company's annual leadership weekend. As with many such events, the leadership had an agenda. These leaders identified their biggest blocker: Data Quality.
Data Quality, a convenient excuse to ignore inconvenient news?
I did not attend this event, but what could be the issue? We all know the adage too well: garbage in, garbage out. Bad data leads to bad decisions. Data can be of bad quality, and we can all agree. What makes leadership good judges of data quality, though? Leadership will have beliefs and opinions; when the data aligns and confirms them, we can believe that data quality is good. However, data does not care about beliefs and opinions, so what happens when the data contradicts the leadership? The leadership will challenge data quality and how we collect it.
In Digital Analytics, one of the arguments we face, like zombies that won't die once and for all, is how the orders our tools report never match the backend. Another one is how adding the daily unique visitors never matches the weekly unique visitors for a given week. Two different tools may track the same web pages and won't report a matching number of visits.
Visitors cannot opt out of seeing their orders tracked in a backend system, but any client-side web technology will depend on a cookie consent banner, and your clients will have a choice. Sometimes, the browser or their antivirus will decide for them, often without their knowledge.
Some mischievous visitors start their visits just before midnight and end them just after, thus counting as unique visitors for two days in a row, as if on purpose. Some tools consider that visits end at midnight, and there is significant variability in how much idle time marks the end of a visit. Unique visitors measure unique browsers and do not follow people across devices. When visitors clear their cookies, they count as new people.
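To make the daily-versus-weekly mismatch concrete, here is a minimal sketch. The visit log, the visitor IDs and the `daily_uniques` helper are all invented for illustration and do not come from any particular analytics tool. It only shows why summing daily unique visitors overcounts anyone who returns on another day.

```python
from datetime import date

# Hypothetical visit log: (visitor_id, visit_date). All values are made up.
visits = [
    ("a", date(2024, 1, 1)),
    ("b", date(2024, 1, 1)),
    ("a", date(2024, 1, 2)),  # same browser comes back the next day
    ("c", date(2024, 1, 3)),
    ("a", date(2024, 1, 3)),  # and again on the third day
]


def daily_uniques(log):
    """Map each day to the set of visitor IDs seen on that day."""
    per_day = {}
    for visitor_id, day in log:
        per_day.setdefault(day, set()).add(visitor_id)
    return per_day


per_day = daily_uniques(visits)

# Adding the daily figures counts visitor "a" once per day they showed up.
sum_of_daily = sum(len(ids) for ids in per_day.values())

# The weekly figure deduplicates across the whole period.
weekly = len({visitor_id for visitor_id, _ in visits})

print(sum_of_daily, weekly)
```

Neither number is wrong. They answer different questions, which is why they cannot be expected to match.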
In Digital Analytics, we know that data can fall short of the standards required in accounting. Trust does not have to be binary, complete or lacking. In our field, an 80% match with some backend system is good enough as long as our trend tracks the trend from whatever more trustworthy system is available.
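One way to check the "good enough" idea above is to look at two numbers side by side: the overall match rate and how closely the two trends move together. The sketch below uses invented daily order counts and a hand-written Pearson correlation. The figures, the variable names and the choice of Pearson correlation as the measure of "tracking the trend" are my assumptions for illustration, not a standard.

```python
from math import sqrt

# Hypothetical daily order counts. All numbers are made up.
backend_orders = [100, 120, 90, 150, 130]   # the more trustworthy system
analytics_orders = [82, 95, 73, 121, 103]   # what the client-side tool saw


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Share of backend orders that the analytics tool also recorded.
match_rate = sum(analytics_orders) / sum(backend_orders)

# How closely the analytics trend follows the backend trend (1.0 = perfectly).
trend_correlation = pearson(backend_orders, analytics_orders)

print(f"match rate: {match_rate:.0%}, trend correlation: {trend_correlation:.3f}")
```

With numbers like these, the tool misses about a fifth of the orders yet rises and falls on the same days as the backend, which is the situation I would call good enough.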
Digital Analytics tools may never reach the precision leadership claims it requires to trust our data and recommendations. All the focus on creating ways to recognise people across browsers and devices may address some of the criticisms, but it may be an exercise in futility. Even if our data were impeccable, leadership would still question its quality if it challenged their beliefs and opinions.
Disruption risk was overblown
There was a time when consultancies tried to sell data as the way to make bias-free business decisions and were laughed out of the premises. The advent of quant trading showed that it could work. No human being or team can trade at the speed the investment banks trade now. But where speed is less important, leaders must project certainty, absolute faith, and confidence in their vision. Many founders left an organisation where they were part of the middle management because their beliefs clashed with the CEO's vision. Now that they are CEO, are they supposed to let data inform their strategy, never mind drive it?
The same consultancies then presented the demise of Kodak and Blockbuster as a sign that more companies were about to go bankrupt. Embracing data would be the way to avoid the same fate. For years, that narrative prevailed and gained traction until another view emerged. What if Kodak and Blockbuster were nothing more than poorly run businesses? In hindsight, how many large companies have joined Kodak and Blockbuster as victims of disruption in the last 20 years? Not many.
The emergence of digital cameras, the price drops that ensued, and Netflix may all have accelerated the decline, but did they cause it? Kodak has enjoyed a surprising comeback in the past five years and now struggles to hire as people are rediscovering the quality of analogue photography, much like how vinyl records sell more than CDs.
A letter from an MIT Sloan Management Review subscriber, published in the winter 2018 issue (volume 59, number 2), surprised me at a time when the disruption narrative was still at cruising speed. The letter reacted to an essay by Paul Michelman, the editor-in-chief, published in the previous issue of the same magazine. The subscriber lamented an "obsession with radical transformation, chaotic disruption, and lust for digital tools."
In his view, this is a knee-jerk reaction and a poor substitute for monitoring the competition and customers and adjusting the strategy accordingly. Companies that do that have nothing to fear from disruption; only the ones asleep at the wheel should. If you run your business well, there's nothing to see here; it's probably too late if you don't. Disruption was never a trend; a few outliers happened in close succession, and people were biased toward seeing patterns where there were none.
My theory about this company's poor data quality is that its data quality is in fact excellent and that the data contradicts the leadership's beliefs and opinions. They can't accept being wrong because it would undermine their leadership, so they call it a data quality issue. That company may be an FTSE 100 company now, but it will face disruption risk if it ignores good data and will become an outlier. Rather than projecting a solid vision, it's better to project that you can adjust your strategy to external forces such as competition, regulators and customers. Not everything is within your control, but your ability to adapt, like water flowing around rocks but aiming for the same direction, is.
Data does not care about beliefs and opinions. It is not a brand of paint that will give your strategy a scientific varnish, but rather the opposite: something that will strip all the wallpaper and paint from your walls and show you their genuine state. It won't be perfect; some stubborn bits will remain on the wall, but it will be close enough to help make better decisions.
When the internet sees the rise of alternative news, and some channels dwarf traditional media and challenge the mainstream narrative, it is right to worry about fake news, but systematically? Some of it may be right, even a lot, but it may also be inconvenient and embarrassing. As a business leader, data contradicting your beliefs may be a genuine data quality issue, but always? Instead of assuming by default that misaligned data is wrong, ask yourself: what if it were correct?
#MeasureCamp #DigitalAnalytics #DataQuality #WAWCPH #CBUSDAW
For more articles like this, subscribe to my newsletter, The Ternary Operator, and follow me on LinkedIn or X: @albangerome
https://sloanreview.mit.edu/article/will-skill-and-velocity-survival-skills-for-a-digital-world/
https://sloanreview.mit.edu/article/is-the-threat-of-digital-disruption-overhyped/