Making sense of it all - the skill you should invest time in

While “making sense of it all” might be an exaggeration, I do fundamentally believe that one subject is particularly important for anyone working in today’s business world: statistics. The ability to understand data, craft surveys and draw the right conclusions has accompanied me in all my previous roles, and especially at Amazon, where we are obsessed with data. Whether it is understanding concepts such as statistical significance and correlation or constructing a reliable survey, your statistical toolbox will come in handy, especially with data analytics and AI / ML on the rise.

The value of statistical knowledge

Understanding the future with AI / ML

The strong growth of machine learning applications across businesses is not surprising. Now that companies have the means to capture vast amounts of data, everyone wants to put that data to use: to build better products, make better product recommendations or improve efficiency. The AI market alone is expected to grow from more than $300 billion in 2021 to over $500 billion in 2024 [1]. The potential is huge and companies are staffing up on data scientists. I am not suggesting that basic statistical knowledge will make you a data scientist, but it does help you understand the basic mechanisms behind data science models. It will also help you identify potential use cases where AI / ML can be applied to solve a business problem. By understanding the concepts of statistics, such as how correlations work, confidence levels and control variables, you will more easily grasp how models are trained and continuously improved.

Constructing mechanisms to collect data

Whether you are in Sales, Marketing, Finance or any other business function, decisions should - whenever possible - be made using data: Where should you spend your marketing budget? Which products do your customers like best? We often answer those questions using data points we collect from various sources. When trying a new process, tool or product, we often choose to construct surveys to get people’s feedback. With well-constructed surveys, we can reliably evaluate processes by regularly checking whether we are improving or not. The mechanics of creating surveys are important, and so is the ability to define which data points are useful in testing a hypothesis.

Let’s assume you implemented a new internal process and want to understand which users are more comfortable applying it. In this scenario, you should think about the data points you want to collect on the participants (maybe department, tenure, skill set) in order to determine whether these attributes influence how well your process is adopted. Building good surveys takes time and is too often done poorly.
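To make the scenario above a bit more concrete, here is a minimal sketch in Python of how survey answers could be broken down by one of the collected attributes. The departments, tenures and comfort scores are invented purely for illustration, and `mean_score_by` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey responses (made-up data):
# (department, tenure in years, comfort score from 1 = low to 5 = high)
responses = [
    ("Sales", 1, 2), ("Sales", 6, 4), ("Sales", 3, 3),
    ("Finance", 2, 4), ("Finance", 8, 5), ("Finance", 5, 4),
    ("Marketing", 1, 3), ("Marketing", 4, 3), ("Marketing", 7, 4),
]


def mean_score_by(rows, key_index, score_index=2):
    """Group rows by the attribute at key_index; return {group: (mean score, n)}."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[score_index])
    return {key: (mean(scores), len(scores)) for key, scores in groups.items()}


by_department = mean_score_by(responses, key_index=0)
for department, (score, n) in sorted(by_department.items()):
    print(f"{department}: mean comfort {score:.2f} (n={n})")
```

With only a handful of answers per group, differences like these are a starting point for questions rather than a conclusion. A real survey would need far more responses per group before reading anything into the gaps.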

The one concept you should understand and keep in the back of your mind

Having a grasp of how AI / ML works is tremendously helpful. Along the same lines, there is one aspect of statistical know-how that I want to point out, because you will see people getting it wrong very, very often: the difference between data that correlates and data that has a causal relationship. Causal relationships, in simple terms, are relationships with a clear cause and effect between A and B, with A having a direct influence on B. Let’s say a rock (variable A) falls on a glass window, which therefore shatters (variable B). In this case it is clear that the rock caused the window to break. In the world of statistics (and business), you usually do not have the luxury of this kind of clear cause-and-effect relationship. You will find correlations: variable A may have some effect on variable B, or vice versa, or both may be driven by something else entirely. Many times, though, people will mistake correlation for causality, while reality is more complex. Let’s take these two examples:

  1. It is the year 2019, a new CEO joins an eCommerce company and sales are thriving over the course of 2019 and 2020. The conclusion: the new CEO is doing a tremendous job, just look at the sales numbers! There could of course be truth to it, but again you should be careful not to mistake correlation for causality. The timing correlates and sales are thriving with the new CEO, but another external factor, in this case a global pandemic, is likely a key contributor to the spike in the company’s eCommerce sales.
  2. You might have heard that CEOs read a lot. If you now assume a causal relationship between reading and becoming a CEO (one leading to the other), you are mistaking correlation for causality. The statement may very well be true, but it is likely that it was not only the reading that turned people into CEOs. They typically have a higher education, which in turn will make them more prone to read books. So you have additional factors coming into play, far beyond a simple “variable A leads to variable B” relationship.
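The second example can be sketched as a small simulation in Python. Everything in it is an assumption made for illustration: the probabilities, the reading averages and the idea that education is the only driver are invented, not real figures about CEOs. In this toy world, higher education raises both the number of books read and the chance of becoming a CEO, while reading itself has no effect on the outcome at all.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(42)  # fixed seed so the run is repeatable
people = []
for _ in range(20000):
    # Hidden third factor (all numbers below are invented assumptions).
    higher_education = rng.random() < 0.3
    # Education drives reading: books read per year.
    books = rng.gauss(12 if higher_education else 4, 3)
    # Education drives the outcome; books are deliberately NOT used here.
    became_ceo = rng.random() < (0.10 if higher_education else 0.01)
    people.append((higher_education, books, float(became_ceo)))

# Correlation between reading and becoming a CEO across everyone.
overall = pearson([b for _, b, _ in people], [c for _, _, c in people])

# The same correlation, computed separately within each education group.
within = {}
for level in (True, False):
    group = [(b, c) for e, b, c in people if e == level]
    within[level] = pearson([b for b, _ in group], [c for _, c in group])

print(f"overall correlation:        {overall:.3f}")
print(f"with higher education:      {within[True]:.3f}")
print(f"without higher education:   {within[False]:.3f}")
```

Across everyone, reading and becoming a CEO should come out clearly positively correlated, even though reading does nothing in this model. Once you compare only people with the same level of education, the correlation should drop to roughly zero. That is the role of a control variable: holding the third factor fixed to see what is left of the relationship.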

Grasping this concept is tremendously helpful, as you can more easily spot data being misinterpreted or correlations being presented as causality. It makes you realize that data often requires a lot of digging in order to find true meaning and valuable insights. We hope for easy answers, but the truth is often more complicated.

Let me know your thoughts!

[1] IDC Forecasts Improved Growth for Global AI Market in 2021





Utpal Utkalit

Salesforce Senior Technical Architect

3 years ago

I agree, we are tempted to draw conclusions without looking at the other aspects. Your example 1 is a classic case: it is possible that the earlier CEO had a vision and spent the revenue on making the org future-ready. The idea is to understand and analyse the underlying pattern for better prediction.

Matthias Bullmahn

Your unfair advantage in marketing for more revenue

3 years ago

awesome article, thanks for sharing your thoughts Michael Gerrity
