Why Big Data Doesn’t Always Mean It's Good Data
Dominic S.
Expert Patient | Cyclist for MS | Researcher | AI Advocate | Author | Reviewer | Lecturer | YouTube & Podcast Host
Or, why a Chief Data Officer may not be such a bad idea after all.
The Data Scientists out there will sigh as they feel that they have heard this a thousand times before. However, it is human beings who are the issue. Numbers are just numbers; it is what we humans do with them that causes the problems.
Very quickly then; this is the correlation and causation argument writ large.
Silly, isn't it? But on the face of it, it can make sense. This is exactly what is meant by post hoc ergo propter hoc and, silly examples aside, people see an apparent relationship like this and, out of ignorance, start down a very dangerous path. Merely acquiring more and more data points, a bigger data set, and better hardware, software and human expertise to manipulate this data does not equal better results from the data.
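For anyone who wants to see the trap in action, here is a minimal sketch in Python. Everything in it is made up for illustration: the two series (ice_cream_sales and broadband_lines), their growth rates and the noise levels are my assumptions, not real figures. Both series simply rise over time, and neither is calculated from the other, yet they look tightly related until the shared trend is removed.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def yearly_changes(values):
    """Differences between consecutive years, which strips out a steady trend."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


rng = random.Random(0)  # fixed seed so the run is repeatable
years = range(30)

# Two invented series: each rises steadily with time plus its own random noise.
# Neither is computed from the other, so there is no causal link by construction.
ice_cream_sales = [100 + 5 * t + rng.gauss(0, 10) for t in years]
broadband_lines = [20 + 3 * t + rng.gauss(0, 6) for t in years]

raw_r = pearson(ice_cream_sales, broadband_lines)
detrended_r = pearson(yearly_changes(ice_cream_sales), yearly_changes(broadband_lines))

print(f"correlation of the raw series:      {raw_r:.2f}")
print(f"correlation of year-on-year change: {detrended_r:.2f}")
```

The raw series should come out strongly correlated, while the year-on-year changes should look far less so. The only thing the two have in common is the passage of time, which is exactly the kind of hidden third factor that more data points will not reveal on their own.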
Big data is great and powerful when it is clean and accurate data.
But pause and think: before plunging into the analysis and insight phase, the cleaning and tidying phase – the boring stuff that is often skipped past – needs to be complete. The crazy outliers need to be identified, partial data from one source needs to be investigated, in the case of human surveys the ‘don’t know’ answers may be coded out, and so on.
There are a variety of ways to allow the Data Scientists to do this, but the heart of the matter is that if they are not given the time, tools and budget to do this then you are back to the junk in, junk out scenario that affects everything to do with computers.
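To make the boring stuff a little more concrete, here is a rough sketch of those three checks in Python. The records, the field names (source, weekly_spend, satisfaction) and the outlier threshold of 5 are all hypothetical choices of mine. The outlier rule used here, distance from the median measured in median absolute deviations, is just one simple option among many.

```python
from collections import Counter
from statistics import median

# Hypothetical survey records, invented purely for illustration.
records = [
    {"source": "web", "weekly_spend": 42, "satisfaction": "4"},
    {"source": "web", "weekly_spend": 38, "satisfaction": "5"},
    {"source": "web", "weekly_spend": 45, "satisfaction": "don't know"},
    {"source": "web", "weekly_spend": 40, "satisfaction": "3"},
    {"source": "phone", "weekly_spend": 4100, "satisfaction": "4"},
    {"source": "web", "weekly_spend": 36, "satisfaction": "2"},
]

DONT_KNOW = {"don't know", "dk", ""}


def code_out_dont_know(answer):
    """Return an integer score, or None when there is no usable answer."""
    text = str(answer).strip().lower()
    return None if text in DONT_KNOW else int(text)


def flag_outliers(values, threshold=5.0):
    """Flag values far from the median, in units of median absolute deviation.

    The threshold is an arbitrary illustrative choice, not a standard.
    """
    mid = median(values)
    mad = median(abs(v - mid) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [abs(v - mid) / mad > threshold for v in values]


# 1. Identify the crazy outliers (flag them for a human to investigate).
outlier_flags = flag_outliers([r["weekly_spend"] for r in records])

# 2. Count rows per source so a thin or partial feed stands out.
rows_per_source = Counter(r["source"] for r in records)

# 3. Code out the 'don't know' answers rather than treating them as scores.
scores = [code_out_dont_know(r["satisfaction"]) for r in records]

print("outlier flags:  ", outlier_flags)
print("rows per source:", dict(rows_per_source))
print("coded scores:   ", scores)
```

Note that the sketch only flags suspect rows; it does not delete them. Deciding whether 4100 is a keying error or a genuinely unusual customer is a human judgement, which is rather the point.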
As humans we are programmed in ways that really hamper us. This is especially true when we are operating outside of our field of expertise or are very out of date regarding a subject matter area. Our brains crave clarity and simplicity; we avoid the unknown as that is where danger may lie. We want to make as smooth and risk-free a transit through life as possible. Because of this the best and the brightest can suddenly become very credulous and succumb to deep-seated fear and prejudice. This propensity feeds the behaviour of some because they are told something, seize upon it and then happily transmit it to others as fact. The recipients believe it, often more so when it is passed to them by a person or source in whom additional credibility is invested. This way bad choices are made and then compounded through repetition.
I was struck when listening to an episode of The Infinite Monkey Cage – a science programme on BBC Radio 4 – where anthropologists and evolutionary biologists were tearing their hair out at the traction an image we are all familiar with has gained. The picture of man evolving from ape to upright walking man is apparently terribly inaccurate and misleading. Apparently, it first appeared in a French school textbook back in the Fifties and resonated so much (which shows the power of a credible source and a good image) that it stuck and has been reproduced millions of times over. I had no idea how inaccurate it was, and I like to think that I am not very credulous. It goes to show the power of something that has been ever-present though. Few people except the experts challenge it, even now.
The iconic, contested and wholly inaccurate image
Bringing this to business: I feel for the person or team at Apple who had to brief Tim Cook and co. that the earnings forecast had to be dramatically trimmed because the previous cash cow, the iPhone, was no longer selling as quickly. I do appreciate that I have the benefit of hindsight in what follows. People were hanging onto their devices for longer and were railing against the so-called planned obsolescence that many believed was being built in. Couple that with the belief that the latest OS was designed to overwhelm older devices, and yet that without the latest OS the functionality was going to be limited henceforth, and you have something that really upsets consumers. If that is combined with the increase in length of the service contracts we are all but forced to agree to by the network providers (here in the UK at any rate) in order to have the latest tech, subsidised by these growing contracts, I suspect this wouldn’t have been such new news. Interestingly, it was spun as a particular Chinese market issue.
For this to happen we can see the clever PR operation swing into action. Apparently great PR relies so heavily on gut feelings and relationships that people overlook how incredible people are at computing very complex Big Data. Still far ahead of any computer. To wit: the entire slowdown has been pinned almost completely on the Chinese market. Something I find hard to swallow. I have no doubt it is a large component and very politically expedient, given the way China is portrayed in the US these days. The messaging seems to play heavily on the deterioration of relations between the US and China. The PR teams are operating on very thick and contextual data, nothing more. The human brains are the computers here. Either way, it is apparently not the fault of Apple… *coughs politely*
On the other hand, perhaps they knew of this trend and the feelings that underpinned it. Perhaps they had excellent Big Data and had combined it with the Thick Data approach and insights of Anthropologists, Sociologists and Political Scientists who specialise in these fields, so that they could synthesise the findings into usable data. In that case the real issue wasn’t knowing this, but when to let the markets know. Hmmm.
Sadly, few large companies manage to meld their data very effectively. Usually, the larger they are, the greater the disconnect between the boardroom and the customer, and the inadequacies of the information providers aren’t spotted soon enough.
What about the person responsible, or is there one? Challenging assumptions is often uncomfortable and often seen in an organisation as disruptive and potentially unwanted behaviour. A Chief Data Officer (CDO) ought to have both the support and power to ask the ‘who, what, when, where and why’ questions, relentlessly. In fact, if they aren’t querying the data they are to use for gaining insight and helping the other leaders to make the best informed decisions, they are probably falling short in their role.
Dominic Shadbolt
www.theproblemwithdata.com