
Data Science “Paint by the Numbers” with the Hypothesis Development Canvas

Bill Schmarzo

Dean of Big Data, CDO Chief AI Officer Whisperer, recognized global innovator, educator, and practitioner in Big Data, Data Science, & Design Thinking

Published: October 23, 2018

When I was a kid, I used to love “Paint by the Numbers” sets. They make anyone who can paint or color between the lines a Rembrandt or Leonardo da Vinci (we can talk later about the long-term impact of forcing kids to “stay between the lines”).

Well, the design world is applying the “Paint by the Numbers” concept using design canvases. A design canvas outlines what’s important given the subject area, and then allows the “painter” to color in the right information. A design canvas is a one-page operational template that is designed to capture all of the different perspectives necessary for successful execution, depending upon the problem being solved. A great example of a canvas is the Business Model Canvas courtesy of Strategyzer (see Figure 1).

Figure 1: Business Model Canvas courtesy of Strategyzer

The Business Model Canvas forces organizations to “paint in” their business’s value proposition, cost structures, revenue streams, supplier network and customer segments. Overnight, everyone can become Jack Welch!

Now you are ready to take the next step from a Big Data MBA perspective by building off of the Business Model Canvas to flesh out the business use cases – or hypotheses – which is where we can become more effective at leveraging data and analytics to optimize the business. That next step involves the newly created Hypothesis Development Canvas.

Introducing Data Science Paint by the Numbers

The one area of under-investment in most data science projects is the thorough and comprehensive development of the hypothesis or use case that is being tested; that is, what we are trying to prove with our data science engagement and how we measure progress and success.

To address these requirements, we developed the Hypothesis Development Canvas – a “paint by the numbers” template that we populate prior to executing a data science engagement to ensure that we thoroughly understand what we are trying to accomplish, the business value, how we are going to measure progress and success, and the impediments and potential risks associated with the hypothesis. The Hypothesis Development Canvas is designed to facilitate the business stakeholder-data science collaboration (see Figure 2).

Figure 2: Hypothesis Development Canvas

The Hypothesis Development Canvas includes the following:

Hypothesis Description and Objectives – what the organization is trying to predict and its associated goals (e.g., reduce unplanned operational downtime by X%, improve customer retention by X%, reduce obsolete and excessive inventory by X%, improve on-time delivery by X%).
Hypothesis Business Value from the Financial, Customer and Operational perspectives; that is, what is the rough order Return on Investment (ROI) from successfully addressing the hypothesis.
KPIs against which to measure success and progress, and the exploration of the risks associated with potential 2nd and 3rd order ramifications of KPIs. See the blog “Unintended Consequences of the Wrong Measures” for more details on 2nd and 3rd order ramifications of KPIs.
Decisions – the what, when, where, who, etc. – that need to be made to support and drive actions and automation in support of the hypothesis’s business, customer and operational objectives.
Potential data sources to explore, including a brief description and why the business stakeholders feel each might be an appropriate data source to explore.
Risks associated with False Positives and False Negatives (Type I and Type II Errors); the risks associated with those scenarios where the analytic model is wrong.
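
For teams that want to keep the canvas next to their code, the panels above could be captured as a simple record. What follows is only a minimal sketch in Python: the class name, the field names and the predictive-maintenance example values are assumptions made for illustration and are not part of the canvas itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HypothesisCanvas:
    """One record per hypothesis; field names are illustrative assumptions."""
    description: str
    objectives: List[str] = field(default_factory=list)
    business_value: str = ""
    kpis: List[str] = field(default_factory=list)
    decisions: List[str] = field(default_factory=list)
    data_sources: List[str] = field(default_factory=list)
    false_positive_risk: str = ""
    false_negative_risk: str = ""

    def missing_panels(self) -> List[str]:
        """Return the names of panels that are still empty."""
        return [name for name, value in vars(self).items() if not value]


# Hypothetical example: a partially "painted" canvas.
canvas = HypothesisCanvas(
    description="Predict which machines are likely to fail in the next 30 days",
    objectives=["Reduce unplanned operational downtime by X%"],
    kpis=["Unplanned downtime hours per month"],
)
print(canvas.missing_panels())
```

Listing the empty panels before an engagement starts is one lightweight way to check that the business stakeholders and the data science team have covered every part of the canvas.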

A Vision Workshop accelerates the collaboration between the business stakeholder and the data science team to identify the hypothesis requirements that underpin data science engagement success.

The Machine Learning Canvas (Big Data MBA Version)

Now to complete the loop, I introduce the Machine Learning Canvas. I stumbled upon the Machine Learning Canvas v0.4 from Louis Dorard at the web site “Machine Learning Canvas.” Louis has made his canvas freely available, and I will do likewise with the additions that we made to his canvas based upon our unique data science requirements (see Figure 3).

Figure 3: Machine Learning Canvas (Big Data MBA Version)

For purposes of our data science work, we needed to add two panels:

Prescription: Once we have a prediction, what do we do with that prediction?
Automation: How do we automate standard procedures with the prescriptive insights?
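
To make the two added panels concrete, here is a minimal Python sketch of how a prediction might be turned into a prescription, with only the most confident cases automated. The function name, the thresholds and the action labels are hypothetical; real values would come out of the conversation with the business stakeholders.

```python
def prescribe(failure_probability: float,
              auto_threshold: float = 0.8,
              review_threshold: float = 0.5) -> str:
    """Map a predicted probability to a recommended action (illustrative only)."""
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    if failure_probability >= auto_threshold:
        # Automation: confident enough to trigger the standard procedure.
        return "schedule_maintenance"
    if failure_probability >= review_threshold:
        # Prescription only: recommend an action, but leave it to a person.
        return "flag_for_review"
    return "no_action"


for p in (0.9, 0.6, 0.1):
    print(p, prescribe(p))
```

Keeping the thresholds as explicit parameters makes it easy to revisit them as the costs of acting on a wrong prediction become better understood.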

Summary

A successful data science engagement requires close collaboration with the business stakeholders throughout the development process to:

Understand and quantify the sources of financial, operational and customer value creation (it’s an economics thing).
Gain a thorough understanding of the KPIs and metrics against which we are going to measure progress and success, and in particular, the potential second and third order ramifications of those KPIs and metrics.
Brainstorm the variables and metrics (data sources) that might yield better predictors of business and operational performance.
Codify the rewards/benefits and the costs/risks associated with the hypothesis (including the risks and costs associated with False Positives and False Negatives).
Collaborate closely with the Business Stakeholders to understand when “good enough” is actually “good enough” from a predictive analytics perspective.
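
One way to codify the costs of False Positives and False Negatives is to price each kind of error and compare models on total error cost rather than on error count. The Python sketch below does that; the error counts and the per-error costs are made-up numbers for illustration, not results from any engagement.

```python
from typing import NamedTuple


class ErrorCounts(NamedTuple):
    false_positives: int
    false_negatives: int


def error_cost(counts: ErrorCounts, cost_per_fp: float, cost_per_fn: float) -> float:
    """Total cost of a model's mistakes, given a price for each kind of error."""
    return counts.false_positives * cost_per_fp + counts.false_negatives * cost_per_fn


# Hypothetical numbers: a needless inspection (false positive) is cheap,
# a missed failure (false negative) is expensive.
COST_PER_FP = 500.0
COST_PER_FN = 10_000.0

model_a = ErrorCounts(false_positives=40, false_negatives=5)
model_b = ErrorCounts(false_positives=10, false_negatives=15)

cost_a = error_cost(model_a, COST_PER_FP, COST_PER_FN)
cost_b = error_cost(model_b, COST_PER_FP, COST_PER_FN)
print(f"Model A: {cost_a:,.0f}  Model B: {cost_b:,.0f}")
```

With these assumed costs, Model A makes more mistakes in total but is the cheaper model to be wrong with, which is the kind of trade-off that decides when “good enough” really is good enough.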

We now have three design canvases that not only link our business model to the data science and machine learning efforts, but also give the data science team a direct “line of sight” to how their work impacts the business model.

Figure 4: Linking Business Model to Data Science to Machine Learning

So to quote the famous American philosopher and part-time groundskeeper Carl Spackler, “Now I’ve got that goin’ for me, which is nice.”

Jose Arevalo

Subgerente de Transformación Digital

4 years ago

Excellent!

Sander Stepanov , Ph.D. TURNING DATA TO MONEY

***** World-Class Super Doer and Anti-Hype Artificial Intelligence *****, Generative AI, Large Language Models LLM, Vector Databases, RAG, Embeddings, Machine Learning

4 years ago

MAY YOU SHARE LINK TO DOWNLOAD THIS FILE?

Alan Justino da Silva

Senior Software Engineer at PolySwarm

6 years ago

Samuel, I am wondering what you think about it.

Mihai Ionescu

Strategy Management technician. 20,000+ smart followers. For an example of a strong nation, look where European cities are bombed every day by Dark Ages savages. Slava Ukraini!

6 years ago

WHEN 'ANALYTICS' GUYS ENTER THE STRATEGY DOMAIN https://goo.gl/o9crVu .

Mihai Ionescu

Strategy Management technician. 20,000+ smart followers. For an example of a strong nation, look where European cities are bombed every day by Dark Ages savages. Slava Ukraini!

6 years ago

I presume that you regard the business application of hypotheses, since you've started from a business model concept, on the left side. You also use predictions and hypotheses (more or less interchangeably), which indicates that you consider future business states/positioning, therefore a Strategy-related application of hypotheses. Two things: (1) If you don't have a methodology framework (for Strategy) that defines the required hypothesis nodes and process milestones, you don't have a structured, deterministic, causal hypothesis construct. You only have a haywire mix of hypotheses, predictions and decisions. And you can DO MORE HARM THAN GOOD. Unfortunately, the BMC (Business Model Canvas) is the WRONG [Strategy] methodology framework to start from (see here why: https://goo.gl/SnKXMF). (2) You cannot build some sort of making-better-hypotheses juggernaut, because you are bound to fail miserably, all the time, even if you put it on ML steroids. Practice has taught us at least that much. And read something like the Strategy Paradox to better understand why (https://goo.gl/xXcNPV). Try the more successful and empirical Strategy Dialogue (https://goo.gl/RPQEhi), which uses a diversity of human perspectives and experiences, brought together to filter the good/bad hypotheses with surprising efficiency, in practice. And now back to the hypothesis construct. Read this and then re-read your article(s) from the resulting updated perspective: The Chain of Strategy Hypothesis https://goo.gl/uyc7EN ... One more thing: Whatever you do, some of the hypotheses in any anticipatory set of ideas WILL ALWAYS BE INVALID, because it's about the future and, as Pierre Wack once said, "It is impossible to forecast the future and it is foolish to try to do so." (https://goo.gl/MNqEoY).
Therefore, start by assuming that some of the hypotheses are invalid (although we do not know which ones) and get the adaptive system in place, to replace them as soon as their validation becomes possible (proving that they are invalid). Something like this: https://goo.gl/v4Ngw5 .


