Are you “Thinking with Data”?
Gerard Suren Saverimuthu
Regional Technical Leader based in Singapore | Helping clients to infuse Hybrid Cloud and AI for digital transformation | Cyclist and Photographer
I decided to do some reading about the most valuable natural resource in the world: Data! Max Shron's book "Thinking with Data" (Publisher: O'Reilly Media, Release Date: January 2014) covers methods for telling a good story about a data-centric project, and techniques for making sure that we discover the full value of data. Here's a super quick summary of the book and its key learning points.
Most of us know that "Data science is the application of math and computers to solve problems that stem from a lack of knowledge, constrained by the small number of people with any interest in the answers."
It is very tempting, given how pleasurable it can be to lose oneself in data science work, to just grab the first or most interesting data set and go to town. This book reminds clients and data professionals to look past technical issues (around statistics, computer science) and to focus on making an impact on the broad business objectives.
Most people start working with data from exactly the wrong end. They begin with a data set, then apply their favorite tools and techniques to it. The result is narrow questions and shallow arguments. Starting with data, without first doing a lot of thinking, without having any structure, is a short road to simple questions and unsurprising results. We don’t want unsurprising—we want knowledge. Picking the right techniques has to be secondary to asking the right questions.
The secret is to have structure that you can think through, rather than working in a vacuum. Structure keeps us from doing the first things to cross our minds. The best place to find structure is to create a scope for a data problem. A scope is the outline of a story about why we are working on a problem.
There are four parts to creating scope. The four parts are the context of the project; the needs that the project is trying to meet; the vision of what success might look like; and finally what the outcome will be, in terms of how the organization will adopt the results and how its effects will be measured down the line. A mnemonic for these four areas is CoNVO: context, need, vision, outcome.
When a problem is well-scoped, we will be able to easily talk through or write out our thoughts on each of the four parts. Those thoughts will mature as we progress in a project, but they have to start somewhere. Any scope will evolve over time; no battle plan survives contact with opposing forces.
1. Context:
Every project has a context, the defining frame that is apart from the particular problems we are interested in solving. Who are the people with an interest in the results of this project? What are they generally trying to achieve? What work, generally, is the project going to be furthering?
Contexts emerge from understanding who we are working with and why they are doing what they are doing. We learn the context from talking to people, and continuing to talk to them until we understand what their long-term goals are.
2. Need:
When we correctly explain a need, we are clearly laying out what it is that could be improved by better knowledge.
What are the specific needs that could be fixed by intelligently using data? These needs should be presented in terms that are meaningful to the organization. If our method will be to build a model, the need is not to build a model. The need is to solve the problem that having the model will solve.
3. Vision:
The vision is a glimpse of what it will look like to meet the need with data. It could consist of a mockup describing the intended results, or a sketch of the argument that we’re going to make, or some particular questions that narrowly focus our aims.
Someone who is handed a data set and has not first thought about the context and needs of the organization will usually start and end with a narrow vision. It is rarely a good idea to start with data and go looking for things to do. That leads to stumbling on good ideas, mostly by accident.
4. Outcome:
The outcome is distinct from the vision; the vision is focused on what form the work will take at the end, while the outcome is focused on what will happen when we are “done.”
Figuring out what the right outcomes are boils down to three things:
- Who will have to handle this next?
- Who or what will handle keeping this work relevant, if anyone?
- What do we hope will change after we have finished the work?
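The four CoNVO parts described above can be written down as a simple checklist before any data is touched. The following is only a minimal sketch in Python, not something taken from the book: the `Scope` class, its method names, and the retail churn example are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class Scope:
    """Hypothetical container for the four CoNVO parts of a project scope."""
    context: str = ""  # who has an interest in the results, and what they are trying to achieve
    need: str = ""     # what could be improved by better knowledge
    vision: str = ""   # what meeting the need with data would look like
    outcome: str = ""  # how the results will be adopted and measured afterwards

    def missing_parts(self) -> List[str]:
        """Return the names of the CoNVO parts that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def is_well_scoped(self) -> bool:
        """A scope counts as complete here once every part has some text."""
        return not self.missing_parts()


# Illustrative (made-up) draft: context and need are written, the rest is not.
draft = Scope(
    context="A retailer wants to keep more of its existing customers.",
    need="Understand which customers are likely to leave, and why.",
)
print(draft.missing_parts())
print(draft.is_well_scoped())
```

A check like this only confirms that something has been written for each part; whether the thinking behind it is any good still comes from talking to the people involved.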
As we become surrounded by more and more data, the concepts behind thinking with data are a must for every IT professional.