Data Doodling
Gurunath Hari
CTO | Guiding Leaders make employees feel valued - PEAKISM? Sustainability & Holistic Wellbeing Analytics | Sales Productivity, Digital Transformation, Partnerships, L&D | BestsellingAuthor | ICF Career & Sales Coach
With the explosion of data came the ability of employees to access it. Over time, pulling data out of deeply ensconced databases became easier and more intuitive. We have gone from having to learn R or write Python scripts to access data, to dragging and dropping icons onto a canvas in apps like Alteryx. Now anyone with a stub for a finger and 2 brain cells can start doing wonders. What do I mean by that?
It's called Data Doodling!
Combine speed of access with powerful data engineering in a few clicks, add a dash of data visualization, and you start entering the realm of what I call Data Art.
While beautiful, purposefully curated data and visualizations are the norm today, I hypothesize that breakthroughs using data are more likely to come serendipitously via data-doodling. We know doodling is the act of creating drawings in an unconscious or unfocused manner. What is data-doodling?
Data-doodling is the act of creating data associations in an unfocused but consciously curious manner.
Doodling takes no skill except
- in this case the drawing implement is the icon set on the screen and a mouse. You could say it's complementary to machine learning, where data is meticulously cleaned, balanced, crunched, chopped, trimmed, and put through torturous iterations using algorithms like Random Forest, XGBoost etc. to yield a highly anticipated and pre-determined outcome.
Example of a data-doodle
The above heatmap was a result of data-doodling. It reveals the associations between the results of hundreds of employee assessments along 6 dimensions of well-being. Post facto it looks like a legitimate milestone in exploratory data analysis, but in reality it was just doodling.
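For readers who would rather script such a doodle than drag icons, here is a minimal sketch of the numbers behind a correlation heatmap. It is not the workflow used for the heatmap above: the six dimension names and the scores are hypothetical, synthetic stand-ins, and the helper names pearson and correlation_matrix are made up for illustration.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def correlation_matrix(columns):
    """Return {name: {name: r}} for every pair of named columns."""
    return {a: {b: pearson(columns[a], columns[b]) for b in columns}
            for a in columns}


# Hypothetical dimension names and synthetic 1-5 scores (NOT the author's data).
random.seed(0)
dimensions = ["physical", "emotional", "social", "financial", "career", "spiritual"]
scores = {d: [random.randint(1, 5) for _ in range(300)] for d in dimensions}

matrix = correlation_matrix(scores)

# Print the matrix as a text grid; a plotting library could color the same values.
for a in dimensions:
    print(f"{a:>10} " + " ".join(f"{matrix[a][b]:+.2f}" for b in dimensions))
```

Because the synthetic columns are generated independently, the off-diagonal values should hover near zero, while the diagonal is exactly the correlation of each dimension with itself.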
Advantages of Data Doodling
Not all answers are known, and we know there are questions that we don't yet know to ask. With wall-to-wall implementation of data analytics tools, the number of employees able to indulge in data-doodling is already in the millions. It is likely that 99.9% of data-doodles will at best provide amusement value and at worst just burn compute resources, but the 0.1% that hit on a Eureka! idea may well result in any one of:
The above data-doodle shows that each wellness dimension is almost independent of the others (very low correlation). That led to wanting to explore the efficacy of the questions in the questionnaire a little deeper, which in turn led to computing Cronbach's alpha. It rounded out to 0.8, which validated the integrity of the questionnaire as a whole!
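As a rough illustration of that last step, Cronbach's alpha for k questionnaire items is k / (k - 1) * (1 - sum of the item variances / variance of the respondents' total scores). The sketch below uses only the Python standard library; the function name cronbach_alpha and the toy responses are assumptions for illustration, not the questionnaire data discussed here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of k items.

    Each item is a list holding one score per respondent, in the same
    respondent order across all items.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(item) for item in items)
    # Each respondent's total score across all items.
    totals = [sum(responses) for responses in zip(*items)]
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Toy responses: 3 hypothetical items answered by 5 respondents.
toy_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(cronbach_alpha(toy_items), 3))
```

Values closer to 1 indicate that the items move together; the commonly quoted rule of thumb treats roughly 0.7 and above as acceptable internal consistency.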
Key Takeaways
Data is the new oil, they say. I'd say data is the new oil paint. Each dataset is a different color waiting to be blended and laid out. Today it's possible to doodle with this medium, and we call it data-doodling. While beautiful, purposefully curated data and visualizations are the norm today, breakthroughs using data are more likely to come serendipitously via data-doodling.
Further research
I invite students of data science to conduct research to test this hypothesis:
H0: Breakthroughs are not a result of data-doodling.
Structure the study so that the test group is made aware of what data-doodling is, and ensure its members are proficient with the tools they will use. Don't limit it to the visualization level; go deeper.
There could be a data Picasso in everyone!
Go ahead! Try it.
Acknowledgements:
Many thanks to my data science mentors and teachers: @sayan