Data Scientists think data is their #1 problem. Here's why they're wrong.
James Taylor
Leading authority on Digital Decisioning and delivering business impact from AI and machine learning
I often see articles or posts that identify data integration or preparation as the key issues facing data science projects. This always puzzles me as this is not our lived experience - not what we see when we work with Fortune 500 companies adopting predictive analytics, machine learning or AI. But I think I have figured it out. The problem is as follows:
What data scientists think counts as a "data science project" is not, in fact, a data science project.
Let me illustrate this with some data from a great study. Back in 2016, the Economist Intelligence Unit ran a survey called "Broken links: Why analytics investments have yet to pay off", and below you see how this data appears to support the argument that data problems are #1.
Wow - it looks pretty clear that data integration/preparation is the biggest problem, with nearly twice as many projects reporting it as the next one.
In fact, though, this is a subset of the data from the survey. Here's the full data set:
Data integration and preparation only ranks #4. Problem definition/framing, Solution approach/design and Action/change management all rank higher. This is our experience.
In large, established "grown-up" companies, data science projects fail for one or both of two reasons:
- They are solving the wrong problem. They are building an analytic that is not what the business needs, that will not solve a true business problem or that is poorly designed to fit into the business context.
- They cannot act on the model they build. They can't change the business's decision making so that the decisions made and actions taken actually take advantage of the analytic.
And this illustrates the problem.
The problem is that data scientists THINK their project starts with data and ends with the communication of their analysis. If that's your focus, then data is your #1 problem.
But this is not where data science projects start nor where they end. They have to start and end with the business. That means starting with a business problem - a business decision that the business wants to improve - and ending with that problem being solved - the business behaves differently (better). If that's your focus, then your problem is not data but problem definition and operationalization - making the analytic work IRL.
Here's the difference, shown on those phases. On the left, what many data scientists think their projects involve and on the right, what they really involve.
Bottom line: If your data science team is telling you that data is their #1 problem, then they're doing it wrong.
I've written about this before - check out this article on the study itself and this one on adopting decision modeling as a better way to define the problems your data science team is trying to solve. You might also like our recent white paper and videos on Building an Analytic Enterprise.
Feel free to connect or message me with questions and comments. And if we can help your data science team start working on a better definition of a project, we'd love to.
Accelerate Growth, More Profitably, and with Less Frustration
4y Great article, James. Thanks for sharing!
Data Insights Practice Lead | Passionate about delivering business outcomes through Data & Analytics | Delivered strong revenue growth | Developed and grew Data & Analytics Team and Practice - Seeking next opportunity
4y Completely agree. What is often left out when framing the problem is the context: sometimes the problem is too tightly focused, the wrong questions are asked, and no consideration is given to what might actually be done with the insight. Projects that design a fully thought-through path from problem framing to testing for business outcomes from derived insights, with all the associated business process, change management and behavioural understanding, will have a much greater chance of sustainable success.
Helping companies transform their business with data
4y Great article, James! Congrats!
Fifty years in software | Learning every day | Still an optimist
4y James, based on my 45-year career, I concur with your analysis. Our challenge as practitioners is to deliver practical results. We must first ask "the business" if and how they intend to act on the results before spending any of their time and money on a "data science project." Our primary tool is listening; math is maybe second or third.
Director at Analitika & Intelligence Center/SCIP-Solution Provider Individual Member
4y A pleasure to read your opinion!!!! I think the same way. A lot of time without communicating!!!