The Half-life of Data
[Heads-up: this is a science-heavy read]
The entire science batch of my junior college had two big groups:
1) Medical students, who took Physics, Chemistry and Biology as their main subjects
2) Engineering students, who took Physics, Chemistry and Mathematics as their main subjects
The concept of junior college is unique to India. Generally, students complete this stage of education at school. However, under the Maharashtra board, many schools teach only up to the 10th standard (SSC); for Higher Secondary (HSC), one must attend a junior college for two years.
At the age of 15 or 16, I was too young to cope with local train travel in Mumbai. My father tried to convince me to take admission in a college in the vicinity. But I was influenced by my friends and was adamant about getting admission to a college on the Mumbai town side.
My father helped me secure admission at Wilson College. Once upon a time, it used to be a very reputed college in India. It even served as Mumbai University in its initial years. Founded in 1832, it is one of India’s oldest colleges.
Very soon, I realised what a mess I had created for myself. A 1.5-hour local train journey was too much for a young student who had just stepped out of high school and weighed 55 kg. The train travel involved a lot of standing, and that caused me immense leg pain at night. I used to cry and suffer under the shelter of my mom and dad. They always encouraged me to fight and never quit.
Let us come back to the topic of science. I took a combination of Physics, Chemistry, Mathematics and Biology. I am not a risk-taker by nature and didn’t want to close off any options for myself. I enjoyed studying mathematics and biology side by side. And organic chemistry was my favourite. More on these topics later.
Physics was a subject that drew my attention the most. In physics, I studied a chapter on radioactivity. The chapter was short and gave a brief overview. However, it was enough to leave a long-lasting impression on my mind.
I read about the concept of half-life and was deeply struck by its simple maths. The half-life of a substance is the time taken for 50% of the substance to disintegrate (convert) into another form by radioactive decay. Radioactivity is the emission of alpha, beta and gamma rays from an element as it moves towards stability. Radioactive elements like Uranium and Radium are very unstable, i.e. they emit radiation and disintegrate into lighter forms; Uranium, for example, decays into Thorium.
The half-life of Uranium-238 is about 4.47 billion years. For example, if you take one kg of Uranium, it will take 4.47 billion years for it to decay to 500 grams. Amazing, right?
The speed of radioactive decay is directly proportional to the quantity of the substance remaining. The most fascinating thing about this decay is that, mathematically, it takes infinite time to complete. The quantity never becomes zero. This means that 1 kg of Uranium would take an endless time to convert entirely to Thorium, because as the quantity decreases, the speed of decay, i.e. the emission of rays, also decreases. If I were to plot this on a graph, it would look like this:
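For readers who prefer numbers to a curve, here is a minimal Python sketch of the same idea. It assumes the standard exponential decay model, where the quantity halves after every half-life; the function name `remaining_mass` is my own, chosen for illustration.

```python
# Sketch of exponential decay: the mass halves after every half-life.
# The function name and the table layout are illustrative choices.

U238_HALF_LIFE_YEARS = 4.47e9  # half-life of Uranium-238 quoted above


def remaining_mass(initial_grams, elapsed_years, half_life_years):
    """Return the mass left after elapsed_years of decay."""
    return initial_grams * 0.5 ** (elapsed_years / half_life_years)


# Start with 1 kg and look at the mass after 0, 1, 2, ... half-lives.
for n in range(6):
    elapsed = n * U238_HALF_LIFE_YEARS
    grams = remaining_mass(1000.0, elapsed, U238_HALF_LIFE_YEARS)
    print(f"after {n} half-lives: {grams:.2f} g")
```

Each row is half of the one before it (1000 g, 500 g, 250 g and so on), so in this model the mass keeps shrinking but never reaches zero.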
If you have read this far, I believe you have some affinity for physics and science in general. I am grateful that you took the effort and invested the time to read this. Enough of science; let me jump to the title of this article now.
After graduating in Information Technology, I joined Great Place to Work Institute. This workplace is no different from college. It’s a heaven for data scientists. I was suddenly exposed to massive, rich data sets. Within a few months of joining, I was managing the quality of the survey data that the organization uses to publish India’s Best Workplaces List.
The idea first occurred to me back then, but I could not structure it because I was very young and still learning how to speak and write. I felt that all data is very similar to radioactive elements. This thought has stayed with me for the last six years, and I am now throwing some light on it.
I feel that survey data loses its value with time. For example, the census carried out in 1971 is less relevant than the census conducted in 2011. The census of this year will have the highest relevance. However, old data never becomes irrelevant; it just becomes less relevant.
So, survey data also loses its value with time. It decays by emitting stories and knowledge. However, it never becomes obsolete. I feel this applies to all forms of data.
This is a different form of data decay from the ones you may know. You would have heard about data corruption and data leaks; this is very different. The half-life of data is the time taken for the data to lose 50% of its value.
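To make the definition concrete, here is a small Python sketch. It assumes that the value of data follows the same halving curve as radioactive decay, which is the analogy of this article rather than a measured law. The function names and the 2-month half-life in the usage lines are hypothetical.

```python
import math

# Sketch only: assumes data value decays exponentially, as in the
# radioactivity analogy. Names and numbers are illustrative.


def value_remaining(elapsed, half_life):
    """Fraction of the original value left after `elapsed` time units."""
    return 0.5 ** (elapsed / half_life)


def time_until(fraction, half_life):
    """Time taken for the value to fall to `fraction` of the original."""
    return half_life * math.log2(1 / fraction)


# Hypothetical data set with a half-life of 2 months.
print(value_remaining(elapsed=2, half_life=2))   # half of the value is left
print(time_until(fraction=0.25, half_life=2))    # two half-lives, i.e. 4 months
```

The second function simply inverts the first: it answers how long you can wait before the data is worth only a chosen fraction of what it was on day one.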
In the context of survey data, and especially employee engagement, this half-life is very small. For example, let’s say you have 1000 employees in your company. You conduct a sample survey covering 141 employees at an 80% confidence level and a 5% margin of error. The survey completes in 2 weeks with a 70% response rate. It takes 1 month for the results to reach you. You take another 2-3 weeks to analyse the results and draw up action plans. And if you have 10% attrition, then within 2 months there is a probability that 10% of the survey participants would have left the company. So, the survey results, collected at an 80% confidence level, have their value reduced by 50%. The graph below explains this phenomenon.
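The sample size of 141 in this example can be reproduced with a short Python sketch. It assumes Cochran's formula with a finite population correction and the most conservative response split of p = 0.5. I have not stated above which formula was used, so treat this as one common way to arrive at the figure; the function name `sample_size` is illustrative.

```python
from statistics import NormalDist

# Sketch under assumptions: Cochran's formula with finite population
# correction and p = 0.5 (the most conservative split).


def sample_size(population, confidence, margin_of_error, p=0.5):
    """Return the (unrounded) sample size needed for a survey."""
    # Two-sided z-score for the given confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinitely large population.
    n0 = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
    # Shrink it for a finite population.
    return n0 / (1 + (n0 - 1) / population)


n = sample_size(population=1000, confidence=0.80, margin_of_error=0.05)
print(f"required sample size: {n:.1f}")
```

For 1000 employees this comes to roughly 141.2, in line with the 141 employees used in the example.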
An episodic survey like this produces data with a very small half-life. What’s the solution to this problem of data relevance? You can increase the value by covering a bigger sample or the full population, driving a higher response rate and acting on the data faster. However, these points look good only on paper, not in reality. Covering a bigger sample has commercial and logistical implications, and pushing too hard for a higher response rate introduces noise into the data. Office bureaucracy doesn’t allow anyone to act on data quickly.
Episodic surveys, despite all their downsides, are still very important. They are like the final exam a student takes once a year. They help to identify gaps in workplace culture and give a very high-definition snapshot of your culture. What an organization is left with is a big black void between two episodic surveys. During this time, there’s no way to know what’s happening with the people in the company.
To bridge the gap between two episodic surveys, Great Place to Work has created an ecosystem of real-time feedback. Real-time feedback enables workplace leaders to track their culture like the weather. Just like weather reports and the stock exchange, a real-time feedback dashboard gives leaders a fair idea of the pulse of the people.
The real-time nature of the dashboard means the data refreshes automatically, and one can see a fresh picture of the company culture at any time. The data has a shelf life and it refreshes every day. This ecosystem eliminates the complexities introduced by sample-based episodic surveys, which are acted upon over months. This is a tool that enables you to act while you are on your feet. It’s not an action planning tool; it’s a tool for taking actions and getting instant feedback on them.
Individual anonymity and ease of access make this ecosystem credible and user friendly.
I earlier wrote an article on data as a form of energy, and this time I thought of elaborating on another concept centred around data. Since data is the new oil, it must be viewed differently.
Note: Views expressed are personal
Ethical Sourcing Manager at Twinings
4 years ago: Interesting insight, Ashish. Overcoming the bureaucracy and acting on data insights really is key to making the most of data collection, with which many struggle.
Great Manager Institute
4 years ago: Conceptually the half life of "experience" would be a great idea to explore. For example, 30 years of experience of pre-COVID may have a short half life post-COVID. The graph may not be very different from the half life of data.
Digital Marketer
4 years ago: Was a great read Ashish.
Published Author | Author of "The Happiness Revival" | Senior Consultant Great Place to Work India | St. Xavier's College
4 years ago: It was absolute pleasure to read the entire article!! Very well written!