Making sense of numbers
North American cicadas have a 13- or 17-year life cycle, a weird-looking number


After a major feature release, one of the engineers on the team rushed to tell me that the search conversion rate had increased by 300% according to the team’s data analysis. I challenged the number quite a few times, but he and the team had double- and triple-checked it and insisted that they were counting three times more clicks than before the feature release. I pushed back: if the conversion rate has increased by that much, where is all the additional revenue?

Our main revenue source at the time was clicks to merchants, so a 300% increase in conversions should have produced a similar increase in revenue, yet the daily revenue hadn’t changed at all. Well, it turned out that the data were flawed; the actual conversion rate increase was marginal.
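The check I applied can be written down as a rough sketch in Python. The function name and the `tolerance` threshold are assumptions made for illustration, not something we actually had in place; the idea is simply that a large jump in one metric should show up in the metrics that depend on it.

```python
def is_consistent(metric_change: float, dependent_change: float,
                  tolerance: float = 0.5) -> bool:
    """Check whether a dependent metric moved roughly in line with a reported increase.

    Both changes are relative (3.0 means +300%). `tolerance` is an assumed
    threshold: the dependent metric must show at least that fraction of the
    reported increase to count as consistent.
    """
    return dependent_change >= tolerance * metric_change


# Reported: conversion rate up 300%. Observed: daily revenue unchanged.
print(is_consistent(metric_change=3.0, dependent_change=0.0))  # False

# A hypothetical case where revenue did follow the conversion rate.
print(is_consistent(metric_change=3.0, dependent_change=2.8))  # True
```

A failed check does not say which of the two numbers is wrong, only that they cannot both be right.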

We all have to deal with this type of number validation in our daily and professional lives. Being able to perform a basic validation of any number is crucial not only for making quick decisions but for understanding the world as well. Take fake news, for example: it almost always contains some numbers. Those numbers are usually exaggerated in order to draw attention and make the story believable, but that exaggeration is also the reason such stories can be easily taken down by a quick validation.

Numbers are connected. Not only in the math realm but in the physical world as well.

What if I told you that North American cicadas remain underground for 13 or 17 years before they emerge? The number is hard to grasp: it seems absurd that a species would settle on such a long life cycle, and it is even more difficult to understand why 13 or 17 years rather than 9 or 21. You would have every reason to believe that I just made that number up, but before you go on to check Wikipedia, let me assure you that it is a fact and those cicadas actually spend that much time underground.

There are many explanations as to why, the most commonly cited one being that 13 and 17 are prime numbers, which ensures that when the cicadas emerge their natural predators will not be that “hungry”. If a bird, for example, has a 3-year cycle, it will take 51 years before the bird’s cycle and a 17-year cicada’s cycle coincide, so that as many cicadas as possible survive.
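The arithmetic behind the 51 years is a least common multiple, and it is easy to check with a short Python sketch. The predator cycle lengths and the non-prime cicada cycles (12 and 18) below are hypothetical, picked only to show the contrast with 13 and 17.

```python
from math import gcd


def years_until_coincidence(cicada_cycle: int, predator_cycle: int) -> int:
    """Years until both cycles line up again (their least common multiple)."""
    return cicada_cycle * predator_cycle // gcd(cicada_cycle, predator_cycle)


# Hypothetical predator cycles in years, for illustration only.
predator_cycles = [2, 3, 4, 5, 6]

# Compare two non-prime cycles (12, 18) with the real prime ones (13, 17).
for cicada_cycle in (12, 13, 17, 18):
    meetings = {p: years_until_coincidence(cicada_cycle, p) for p in predator_cycles}
    print(cicada_cycle, meetings)
```

For a 17-year cicada and a 3-year predator the function returns 51, matching the figure above, while a 12-year cicada would meet the same predator every 12 years.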

Numbers are connected with little strings that you can push and pull any time you need to validate a number. If a number is off, you may not be able to validate it on its own, but you can follow its dependencies and effects and validate those.

As another example, what if I asserted that a factory produces X million items of product Y daily? One quick validation is to look at the materials required for that product: are there enough in the world? The same goes for those conspiracy-theory government spaceships that would require amounts of energy the planet doesn’t have (in some cases, not even our solar system).

I have many examples where I, or a team I worked with, made decisions based on a number that was obviously wrong even though it came from real observations. With huge amounts of data and complex attribution schemes, there is always the possibility that a critical number is calculated in a way that is disconnected from reality, yet everyone takes it for granted. Circulating it so that other people can validate it independently is a good way to minimize errors. Do you have any examples to share?

Katerina Kanteraki

Microsoft for Startups Lead Southeast Europe > startups.microsoft.com

2 years ago

Love this! Sometimes the people who create decision-making reports are so deep into the analysis that they lose the big picture and fail to make basic reality checks. They fail to spend another half an hour asking themselves whether the figures they created make sense.

John Raptis

Software Engineer

2 years ago

I think this has to do somewhat with the "Curse of Dimensionality" which loosely states: "The more dimensions a problem or a set of data has, the more sparse their connections become due to the increase of the graph's volume. So obtaining reliable results becomes exponentially more difficult and any piece of data becomes less insightful, even after adding only one extra parameter." I did some writing about this as well a while back. https://curiositysink.substack.com/p/can-we-evaluate-multi-dimensional

Panagiotis Tzamtzis

Head of Data Operations | Baresquare

2 years ago

You are so right! With the amount of data collected today, you can find connected metrics almost everywhere (especially in web analytics datasets). That's why, in our anomaly detection platform, we always choose to show how a detected anomaly compares to the changes in its correlated/connected metrics. We automatically notify our users whether the detected anomaly (e.g. a spike in "Search conversions") was aligned with the change we expected in the rest of the connected metrics (e.g. an unexpectedly stable value in revenue). Besides the use case you mentioned for connected metrics (validating data accuracy), they can also speed up root cause detection! Imagine how your example would work the other way around. If you saw a spike in "Revenue" and at the same time a spike in "Search conversions", you would probably focus only on "Search conversions" (or prior steps of the funnel), as something there would be the root cause of both spikes in the data.

Evangelos Charalampous

Lead Electrification Engineer. Expert in Railway Electrification. at The Hellenic Railways Organisation (O.S.E. S.A.)

2 years ago

Very informative and very impressive !!!!!!

Chris Managoudis

CBO @ doctoranytime | Revolutionizing eHealth

2 years ago

True. Data can lead to very dark places if one doesn't use intuition and logic to fight the urge to jump to conclusions. It is especially hard as we live in a data-driven society, requiring hard numbers and reports to validate everything and make a case. We take those numbers, plug them into models to create solid plans and execute with ruthless efficiency. If we do it right, things are supposed to play out, but in many cases we are victims of jumping to early conclusions. We can't blame the data. The nature of the data will always be of high uncertainty, no matter how many equations we affix to a problem, or how vast an ocean of data pools we create. The blame is ours: it lies in how we interpret the data and how much mental effort we invest to consciously doubt or to accept/validate/hunch-confirm them. And it may be ok to jump to conclusions, if the jump saves much time and effort, and the risk of an occasional mistake is acceptable. But in unfamiliar circumstances, especially when there is no time to collect more info and the stakes are high, the jump is very risky. Thank you for sharing!

