Why "Unknown Unknowns" Matter

In 2002, Donald Rumsfeld, then U.S. Secretary of Defense, famously remarked, "There are known knowns; there are things we know that we know. There are known unknowns; that is to say there are things that we now know we don’t know. But there are also unknown unknowns; there are things we do not know we don’t know." While Rumsfeld got a lot of flak for that remark, the concept of "unknown unknowns" is central to understanding the limits of analytics.

Unknown unknowns can have tremendous impact because no one anticipates them. 9/11, the 2008 financial crisis, and the Heartbleed bug are all examples of unknown unknowns. The more complex a system is, the more likely it is to harbor devastating unknown unknowns. The financial system, the internet, and the climate are examples of systems so complex that humans can't comprehend all the ways in which various factors can affect them. Ironically, this very complexity makes the exercise of understanding unknown unknowns even more important.

Knowable Unknowns

Most of us work on transforming known unknowns into known knowns: Are customers buying Product A? Is my machine failing? By definition, unknown unknowns cannot be known a priori. However, they can be further divided into two groups: the "knowable unknowns" and the "truly unknown unknowns". "Knowable unknowns" are events that, in hindsight, could have been discovered. "Truly unknown unknowns" are events for which there was no data (or very little data) that could have predicted their occurrence.

Nassim Taleb, in his book The Black Swan: The Impact of the Highly Improbable, alludes to the "knowable unknowns" when he puts forth the black swan theory. In a world where white swans are common, he describes a black swan event as one where:

  1. The event is a surprise (to the observer).
  2. The event has a major effect.
  3. After the first recorded instance of the event, it is rationalized by hindsight, as if it could have been expected; that is, the relevant data were available but unaccounted for in risk mitigation programs.

One could argue that the 2008 financial crisis was a "knowable unknown". Had one analyzed the indicators around housing prices and derivatives, the signs of a bubble would have been apparent. In fact, some people, such as the economist Raghuram Rajan, rang the warning bell in 2005.

Identifying "Knowable Unknowns"

"Knowable unknowns" can be either huge opportunities or devastating threats for a business. They may represent new market segments, changes in customer behavior, or an underlying security flaw that you didn't know about. So the question is: if you don't know about it, how do you identify it? Here are a few tips for identifying "knowable unknowns".

Focus on Discovery, not Answers

The first step towards harnessing the power of the "knowable unknowns" is to change your analytics methodology. Most analysts focus on answering a specific question. Focusing on discovery instead can lead to a broader set of insights. Be curious about your data. Play with it and study the patterns that emerge. Run correlations, observe trends, and generate summaries. Analyze the data at various levels of summarization. Let the data speak to you.
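As a rough illustration of analyzing data at more than one level of summarization, here is a minimal Python sketch. The `SALES` records, the field layout, and the helper names `summarize` and `pearson` are all made up for this example; they are assumptions, not data or methods from any real system. The point is only that a relationship can look different once you break a total down by segment.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Hypothetical records: (region, product, units_sold, avg_temperature_c)
SALES = [
    ("north", "A", 12, 5.0),
    ("north", "B", 30, 6.0),
    ("south", "A", 25, 18.0),
    ("south", "B", 28, 20.0),
    ("east", "A", 17, 11.0),
    ("east", "B", 29, 12.0),
]

def summarize(records, key_index):
    """Total units sold, grouped by the field at key_index."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key_index]] += record[2]
    return dict(totals)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def temp_vs_units(records, product):
    """Correlation between temperature and units sold for one product."""
    rows = [r for r in records if r[1] == product]
    return pearson([r[3] for r in rows], [r[2] for r in rows])

# Two levels of summarization over the same data.
by_region = summarize(SALES, 0)
by_product = summarize(SALES, 1)

# The same question asked per product instead of over the whole table.
corr_a = temp_vs_units(SALES, "A")
corr_b = temp_vs_units(SALES, "B")

print(by_region, by_product)
print(round(corr_a, 2), round(corr_b, 2))
```

In this toy data, units of product A rise with temperature while units of product B fall slightly, a difference that the regional totals alone would not reveal. With only three points per product, the correlations are illustrative and nothing more.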

Increase the Breadth of Signals you Collect

You can only observe data that you collect. However, your business may be affected by changes in demographics, fiscal policies, trade agreements or even the weather. Analysts should think hard about factors beyond the current business fundamentals that may impact the enterprise. Identify datasets that you can use to augment your insights and make them an integral part of your analysis.

Pay Attention to the Outliers

In some industries, such as security and financial services, outliers are monitored diligently. For example, a credit card transaction from a zip code that the user doesn't frequent is immediately flagged. In most other industries, however, outliers are ignored. A missing part in a machine or a single case of swine flu may be an outlier, yet it may portend the failure of an assembly line or a disease outbreak.

Analysts often trim their data to remove outliers. While that may be a sound practice for eliminating noise, you may also be throwing out signals that are disguised as noise. Outliers are critical to discovering "knowable unknowns". Create reports that identify the outliers in your data and observe how they change over time. The Target security breach is a classic case where, despite the outliers being reported, no one paid attention to them. In most cases the outliers detected may mean nothing, but in the few cases where they do, they might make or break your business.
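One way such an outlier report could be sketched in Python is shown below. It uses the modified z-score based on the median and the median absolute deviation (MAD), with the commonly cited cutoff of 3.5. That choice of technique, the `WEEKLY_READINGS` data, and the function name `find_outliers` are assumptions made for illustration; the article does not prescribe a specific method.

```python
from statistics import median

def find_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold.

    The score is 0.6745 * (x - median) / MAD, where MAD is the median
    absolute deviation. If MAD is zero the score is undefined, so this
    sketch simply reports no outliers in that case.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []
    return [v for v in values if abs(0.6745 * (v - med) / mad) > threshold]

# Hypothetical daily readings grouped by week (made-up numbers).
WEEKLY_READINGS = {
    "week 1": [100, 102, 98, 101, 99, 103, 97],
    "week 2": [101, 99, 100, 250, 102, 98, 100],
    "week 3": [100, 240, 99, 260, 101, 98, 102],
}

# Report the outliers per week instead of trimming them away.
report = {week: find_outliers(vals) for week, vals in WEEKLY_READINGS.items()}
outlier_counts = {week: len(found) for week, found in report.items()}

for week, found in report.items():
    print(week, found)
```

Here the count of flagged readings grows from zero to one to two across the three weeks. A single flagged value may be noise, but a rising count over time is the kind of pattern worth a second look.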

Image source: Flickr, Black Swan
