Data - Oil or Rust?
[This is a replug of my old Substack post]
The New Oil
In the past few years, “Data is the new oil” has become an oft-quoted phrase.
It was coined to denote the value data carries. Just as oil brought prosperity to countries rich in petroleum, data has helped create trillion-dollar companies.
The metaphor can also be read as a comment on the effort needed to extract that value, much like extracting fossil fuels.
Chinks in the armor
The metaphor is very relevant and all of us can relate to it. But it also makes us feel that just having data lets us do wonders. In practice, a lot of inefficiencies can creep in. Some of them are:
1. Creaky Data - Quality of Data inputs
Here I would like to bring in a comparison with creaking machines gathering rust.
When we start a new company or business, we think through and plan what data to capture, how to transform it and how to use it. The machines run smoothly.
Over time, we just keep adding more and more layers of complexity to it. The machines might start creaking with overload. Thanks to cloud computing, this aspect has become easier to manage.
But this complexity also makes us miss the underlying assumptions.
Take, for example, the name of the user. How is the data captured? Who is capturing it? Is there any incentive for the customer to give it accurately? Is there any more information that could be useful to us? How can we avoid bias in responses?
If these underlying assumptions aren’t questioned from time to time, the quality and depth of data could be affected.
And if this keeps piling up over a period of time, the machines rust. Poor input means poorer analysis and poorer insights.
2. Extracting it right - Sitting on a Gold Mine
The other way we sometimes miss the bus is by not recognising the kind of data we already have.
Maybe we analyse day-wise trends.
But what if a timestamp was already available and we didn't know about it? We could have asked questions like:
How have customers been behaving by the hour?
Is there a start and end time to a transaction?
How different are the two? Can the gap between them give more insights?
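As a rough illustration of those questions, here is a minimal sketch in Python. It assumes each transaction is stored with a start and end timestamp in ISO 8601 format; the sample data and the function names (hourly_counts, durations_in_seconds) are made up for this example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical transaction log: (start, end) timestamps, invented for illustration.
transactions = [
    ("2023-03-01T09:15:00", "2023-03-01T09:17:30"),
    ("2023-03-01T09:40:00", "2023-03-01T09:41:00"),
    ("2023-03-01T18:05:00", "2023-03-01T18:15:00"),
    ("2023-03-02T18:30:00", "2023-03-02T18:32:00"),
]

def hourly_counts(rows):
    """Count transactions by the hour of day in which they started."""
    return Counter(datetime.fromisoformat(start).hour for start, _ in rows)

def durations_in_seconds(rows):
    """Time between the start and end of each transaction, in seconds."""
    return [
        (datetime.fromisoformat(end) - datetime.fromisoformat(start)).total_seconds()
        for start, end in rows
    ]

by_hour = hourly_counts(transactions)
durations = durations_in_seconds(transactions)
print(by_hour)
print(durations)
```

A day-wise view would show only two busy days here. The hourly view shows the activity clustering around 9 in the morning and 6 in the evening, and the durations show which transactions took unusually long.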
3. Centralising - “Having it all in one place”
In the data world, this might be utopia: get all the data into one data warehouse, data lake, or any other form of architecture.
Though it sounds basic, many companies struggle to get there, both in terms of technology and in terms of giving people access.
This has the potential both to keep things simple (no duplication, fewer master data issues, etc.) and to unleash the power of correlating data across seemingly unconnected data streams.
This is what the big data companies seem to do well, and they have become super efficient at it. At the scale they handle, the results are phenomenal! So much so that AWS, which was born as an internal source of efficiency, is now one of the largest revenue streams for Amazon.
4. Context - Do you know your business?
Let us say we have all the data in the world we need. Next comes the question, how relevant is it? Are we able to interpret what we are getting? Which area is the most critical for business?
There is a need to understand context as well as prioritise which areas to look at. A particular insight might help 10 customers a day while another could be impacting a million!
Understanding and building on that business context is very important.
With the advent of deep learning models, the relationship between input and output looks like a black box. That works in areas where there is no need to know “why” or “how” as long as the output helps predict.
But for many areas of business, the context and the relationship are important. They help us understand whether we have gone wrong somewhere and whether we need a fundamental correction in the way we approach a business problem.
They also help keep a check on the tricky topic of cause/effect relationships.
5. “Running Hot Water” - AI/ML Mirage
Machine learning (ML) techniques have started being successfully used in many areas like recognising pictures, speech translation, map navigation, etc. And these techniques are helping areas of Artificial intelligence (AI) like chat bots, general search, recommendation engines, etc.
There is a saying that to one with a hammer, everything looks like a nail. Due to the buzz around it, many of us have been attracted to the craze behind AI/ML. But we need to tread with a lot of caution. Not all problems need to be solved with complex ML techniques. Simple slicing and dicing across relevant dimensions like geography could do the trick.
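To show how simple such a slice can be, here is a small sketch in Python that totals a metric by region. The order records, field names and the total_by helper are all hypothetical, made up only for this example.

```python
from collections import defaultdict

# Hypothetical order records; fields and values are invented for illustration.
orders = [
    {"region": "North", "amount": 120.0},
    {"region": "South", "amount": 80.0},
    {"region": "North", "amount": 200.0},
    {"region": "East", "amount": 50.0},
]

def total_by(records, key, value):
    """Sum the `value` field of each record, grouped by the `key` field."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record[value]
    return dict(totals)

revenue_by_region = total_by(orders, "region", "amount")
print(revenue_by_region)
```

No model is involved, yet a table like this is often enough to show which region deserves attention first.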
Some problems might need a deeper look at processes to understand the cause/effect relationship and arrive at the area to focus on.
And in many places, especially in chat bots, people have been using terms like “AI” for what is basically a set of IF/ELSE conditions. And many a time, such simple conditions might serve 90% of the needs!
We need to be very careful to see through such marketing gimmicks!
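To make the point concrete, here is a minimal sketch of what such a “bot” can look like underneath. The keywords and canned replies are invented for this example and do not come from any real product.

```python
def reply(message: str) -> str:
    """Pick a canned answer using plain keyword checks. No learning involved."""
    text = message.lower()
    if "refund" in text:
        return "You can request a refund from the Orders page."
    elif "hours" in text or "open" in text:
        return "We are open 9am to 6pm, Monday to Friday."
    elif "hello" in text or "hi" in text.split():
        return "Hello! How can I help you today?"
    else:
        # Anything the rules do not cover is handed over to a person.
        return "Sorry, I did not understand that. Connecting you to an agent."

print(reply("Hi there"))
print(reply("I want a REFUND"))
print(reply("What are your hours?"))
```

A handful of rules like these may well cover most routine queries, which is fine as long as nobody mistakes them for intelligence.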
[Insert: With the advance of ChatGPT and other LLMs (large language models), this mirage can now turn out to be true!]
Some of these stories remind me of a brilliant SBI Cards ad where a hotel advertises “Running Hot Water” and actually has a kid running with a bucketful of hot water for the guest.
Conclusion
Data is definitely an important lever for any business for taking action. But apart from just engineering techniques, it needs a lot of thinking and experimentation to get the best out of it.
Thanks for taking the time to read this. Hope you liked it. Please do feel free to share! See you next week!
PS: As you can see, the title was just a bit of intimidation.