Watch Out for the "Big Data" Steamroller!
Davis Balestracci
Improvement Consultant / Public Speaker / Author of Data Sanity: A Quantum Leap to Unprecedented Results
Forget statistics for a moment: Responsibly "parent" your big data to make it "mature" first
Statistics: the art and science of collecting and analyzing data.
"People think that if you collect enormous amounts of data you are bound to get the right answer. You are not bound to get the right answer unless you are enormously smart." (Bradley Efron)
There has been an explosion in new technology for acquiring, storing, and processing data. The “big data” movement (and its resulting sub-industry, data mining) is becoming more prevalent and having major effects on how quality professionals and statisticians – and everyone else – do their jobs.
Big data are a collection of data sets that are too large and complex to be processed using traditional database and data-processing tools. Any change this big will require new thinking.
Rocco Perla, a colleague for whom I have the utmost respect, feels that today's unprecedented attention to and focus on analytics and data-driven decision making has also introduced a number of challenges.
It is poor practice to rely on whatever data happen to be available or to assume sophisticated analytics can overcome poor data quality. The fact that data reside in electronic files says nothing regarding the quality of the data. Observational data are teeming with reproducibility issues, especially when they result from merging many different sources.
Intuitively, as the amount of information increases one would think that the degree of confusion decreases. But this is true only to a point because the situation eventually reverses, reaching a point where more information leads to increasing confusion – low amounts of information and high amounts of information will both lead to a high degree of confusion! There are real economic implications of information overload – including a loss of up to 25% of the working day for most knowledge workers.
W. Edwards Deming often made the point that information is not knowledge. The speed of today’s worldwide instant communication does not help anyone to understand the future and the obligations of management. Do we really need constant updating to cope with the rapidly changing future? It could hardly be accomplished by watching every moment of television or reading every newspaper!
Data Maturity = Data Sanity
Perla introduces a concept he calls “data maturity” to begin to make sense of all the available data, especially with the growing sub-industry of project work inherent in many improvement philosophies. It has five characteristics:
1. Projects and their supporting data (eg, reports, dashboards) are viewed as a resource expenditure that adds various costs and complexity for needed support, which should be only temporary.
No more automatic “data on demand.”
2. Projects do not go on forever. Any project needs formal closure to be retired, after which its measures are also retired.
Any transition from diagnostic and testing data to the consideration of collecting data to hold any gains should be done deliberately and with thought.
3. All measures are operationally defined.
In data-mature organizations, operational definitions are viewed as sacred, because they are the only way to ensure a shared understanding of the work to be done.
For example, is Pluto a planet (x=1) or isn’t it (x=0)? It depends: What is the objective for asking this question? (Never mind how you feel about it!)
4. Improvement measures are clearly and explicitly linked to any changes being tested in the system – and they are collected frequently over a period of time.
The purpose of improvement is to understand – as quickly as possible – if the changes made to a system are leading to improvement and to inform the next test.
5. The dominant form of analysis includes charts of data collected over time to determine, and react appropriately to, common and special causes of variation.
Organizations that are data mature understand the importance of examining data over time and using the charts developed by Walter Shewhart (mainly the control chart for individuals) to distinguish between special and common cause variation.
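As a rough illustration of point 5, here is a minimal sketch of how the limits on a control chart for individuals are commonly computed: the center line is the average of the data, and the limits sit 2.66 average moving ranges on either side of it. The function names and the data are hypothetical, and using the average (rather than the median) moving range is an assumption made for this sketch. It checks only for points outside the limits; a full analysis would also plot the data as a run chart and look for other signals.

```python
from statistics import mean


def individuals_chart_limits(values):
    """Return (center, lower, upper) for a control chart for individuals."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = mean(values)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 = 3 / 1.128, the standard constant for moving ranges of two points.
    spread = 2.66 * mr_bar
    return center, center - spread, center + spread


def special_cause_points(values):
    """Return the indexes of observations that fall outside the limits."""
    _, lower, upper = individuals_chart_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical data, in time order (for example, weekly complaint counts).
weekly_counts = [10, 12, 11, 13, 12, 11, 30, 12, 11, 10]
print(special_cause_points(weekly_counts))  # prints [6]
```

Only the seventh observation is flagged as a possible special cause. The rest is common cause variation, and reacting to those points one at a time would be tampering.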
How does one filter this plethora of information and make sense of it all?
In my experience, virtually without exception, these ideas begin to seriously challenge the way leaders and organizations currently think about information, data, and decision making. Environmental pressures remain constant, and technology is not going to slow down. This creates more dependence on the tremendous inertia of the status quo and a toxic blindness to the deceptive power of this different approach's counter-intuitive simplicity. This blindness especially makes people prone to...
...the most seductive waste of all?
Hand-in-hand with the explosion of big data will be an industry trying to sell you solutions for analyzing it. In fact, a client of mine sent me the figure below:
In this case, the vendor promises highly visual, easy-to-understand information that will allow insight into not only the level of employee engagement, but also the level of leadership effectiveness within each department. They claim their analyses are statistically robust and repeatable over many years (and I'm willing to bet that, unfortunately, most of them use this all too common (wrong) analysis). Note how they "delight their customers" by also throwing in red, yellow, and green color-coding.
Another of their claims is possibly good news, but for an entirely different reason: this one tool is a fraction of the cost of a typical survey.
Well, at least you could save money on all those ongoing silly customer satisfaction surveys – how many of your electronic data files are filled with those?
Data has its place, but the challenge is to maximize its ability to serve us humans with all our limitations – not the reverse.
Perla concludes that the most effective future leaders will leverage this approach to data with a vision, energy, intellect, and moral compass that come from within – not so much from a report or balanced scorecard.
Let's stop the "data tail" from wagging the dog, shall we?
====================================================
Chapter 2 of my book Data Sanity demonstrates Perla's point #5, the deceptive, awesome power of plotting data over time, for solving problems of everyday leadership (10 examples of real data from everyday work).
Chapter 5 explains a "mature" data philosophy and Chapter 7 teaches statistical stratification using p-charts and u-charts to compare performances.
Regardless of your improvement approach, Chapters 1 to 4 of my book Data Sanity teach a robust, results-oriented leadership philosophy designed to catalyze transformation to a "built-in" culture of excellence.
See my LinkedIn profile for more information (downloads of the Introduction/Preface and brief chapter summaries). It's not only for healthcare.
Read Rip Stauffer's review, published in Quality Digest
Director of Service Operations - Flex Technology Group
8 years ago: Thank you again, Davis. I am going to touch on this point soon in my own articles. I have been brought into several organizations paralyzed by charts and reporting. It's really very sad when I see extremely intelligent staff dealing with this. The first activity I enact, with the IT department's assistance, is "shut off all non-essential reporting via email." I won't go into what non-essential means, as it varies by organization. Interestingly, stopping the vast majority of these reports does NOT create an outcry of personnel asking for them, and the business actually continues on! In fact, the personnel can spend more time in PDCA processes.
CEO at ITERATE | Author of AI-Driven Product Design | Available from Amazon | TikTok @the_product_innovator
8 years ago: I think the critical question here is, Who Framed Roger Rabbit?
Q-Skills3D Interactive learning in Continual Improvement for all employees
8 years ago: I think that excessive fiddling with data is worse than excessive data. It might help sell software, but most of it is statistically false and doesn't help users. These days every histogram has a meaningless distribution curve drawn over it to hide any meaning in the data. Control charts are normalized, again making the data meaningless, in the false belief that Shewhart charts don't work without it. Time is wasted picking through data unnecessarily to decide what to base limits upon.