Data quality doesn't matter

When you think about data, the first keyword that comes to mind is quality. This criterion is easy to understand, and it drives a lot of investment in transformation projects and in new solutions that aim to clean the pipes. But it's overrated.

I know many companies that have spent $1-10M or more on projects to clean customer records and ensure the absolute perfection of data acquired internally or externally. Did it drive anything in terms of customer and employee satisfaction, profits, or green impact?

Not really. The obsession with the quality of data records drives massive expenditure and delays roadmaps, because of the change management effort required to make it happen.

Now what? Are there other ways and criteria to look at data that would help you make an impact for your company and society in 2023?

Yes, there are. If you assess your data assets against those criteria, I bet your company will exceed its objectives for 2023.

Disclaimer: I'm the CEO and founder of Starzdata.com. We specialise in data acquisition and apply the approach below to review open datasets and third-party data providers.

Criteria used to evaluate Open Datasets and Third party data providers on Starzdata.com

First criterion: perceived quality, not data quality

The quality of a data asset is usually evaluated against accuracy (how well does the data reflect reality?), fill rates (how many rows are complete?), number of unique records (do we have duplicates?) and number of outliers.

Although this way of measuring data quality is perfectly fine, as it translates into quantitative figures, it says nothing about the quality of the data for a given use.
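To make these classic measures concrete, here is a minimal sketch using only the Python standard library. The function names, the field names in the sample rows and the choice of Tukey's 1.5 x IQR rule for outliers are assumptions made for illustration; they are not how Starzdata or any particular tool computes them.

```python
from statistics import quantiles


def fill_rate(rows, fields):
    """Share of rows in which every listed field is present and non-empty."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(f) not in (None, "") for f in fields) for row in rows
    )
    return complete / len(rows)


def unique_share(rows, key):
    """Distinct key values divided by row count (1.0 means no duplicates)."""
    if not rows:
        return 0.0
    return len({row.get(key) for row in rows}) / len(rows)


def outlier_share(values, k=1.5):
    """Share of values outside the IQR fences (Tukey's rule, assumed here)."""
    if len(values) < 2:
        return 0.0
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return sum(v < low or v > high for v in values) / len(values)


# Hypothetical sample: field names and values are illustrative only.
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 2, "email": "b@example.com"},
    {"id": 4, "email": None},
]
amounts = [10, 12, 11, 13, 12, 11, 10, 500]

print(fill_rate(rows, ["email"]), unique_share(rows, "id"), outlier_share(amounts))
```

These figures are easy to compute, which is exactly why they are popular. None of them carries any information about who will use the dataset or for what.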

A dataset containing millions of client transactions with a 40% fill rate and 12% outliers might be poor for your cash management department, yet transformative for your marketing department. How would you assess the quality of this dataset?

Assessing the quality of a data asset has to take into account the use case and the targeted context of use. If you do so, you will probably decide to cut your data cleaning efforts by half and discover hidden opportunities for use across your company.

Takeaway #1: ask your internal users about their own perception of the quality of a data asset, and capture each user's context (which geography they cover, what the targeted use case is, etc.) and persona. This is what we do on the Starzdata platform when we ask end users for reviews.

Coverage and freshness criteria matter too

How many countries are covered by your CRM data asset? You might realise that the one thing you're investing massively in only covers two geographies and neglects the rest of the world.

Assessing the geographical coverage of data assets

Coverage and freshness are very useful because they turn into numbers. And humans love numbers (at least I do) for classifying things.

Classifying your data assets by coverage or freshness (how many times per day or month are they updated?) will help you discover the holes in your basket and understand why some assets are not reused within the company.

It's as easy as building a pivot table in Excel or Google Sheets, and it will tell you a lot about reuse opportunities and cleaning priorities.
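If you prefer a script to a spreadsheet, the same pivot can be sketched in a few lines of Python. Everything here is hypothetical: the asset names, the figures, and the band thresholds (fewer than 5 countries is "narrow", at least 20 updates per month is "daily", and so on) are assumptions you would replace with your own.

```python
from collections import Counter

# Hypothetical catalogue entries; names and figures are illustrative only.
ASSETS = [
    {"name": "crm_contacts", "countries": 2, "updates_per_month": 30},
    {"name": "web_traffic", "countries": 45, "updates_per_month": 30},
    {"name": "supplier_list", "countries": 12, "updates_per_month": 1},
    {"name": "market_report", "countries": 60, "updates_per_month": 0.25},
]


def coverage_band(countries):
    """Bucket an asset by number of countries covered (assumed thresholds)."""
    if countries < 5:
        return "narrow"
    if countries < 30:
        return "regional"
    return "global"


def freshness_band(updates_per_month):
    """Bucket an asset by update frequency (assumed thresholds)."""
    if updates_per_month >= 20:
        return "daily"
    if updates_per_month >= 1:
        return "monthly"
    return "stale"


def pivot(assets):
    """Count assets per (coverage band, freshness band) cell."""
    return Counter(
        (coverage_band(a["countries"]), freshness_band(a["updates_per_month"]))
        for a in assets
    )


for (coverage, freshness), count in sorted(pivot(ASSETS).items()):
    print(f"{coverage:<9} {freshness:<8} {count}")
```

Cells that are empty, or that hold only a narrow or stale asset, are the holes worth looking at first.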

Takeaway #2: If you have a data catalogue, integrate coverage and freshness as key criteria and make them highly visible to the end users.

Consider end users like internal clients

OK, you've reached the "data product" part of this post. Bottom line: every piece of data that you store within your company, or purchase from a third party, is a product that needs to meet end users to derive value.

If your CRM data is not used by anyone within the company, it means that there are no clients for the CRM data product. As an investor, would you put a dime on a business that doesn't drive demand? No.

Therefore, data needs to be considered as a product. If data were water, you would think of it as bottles of water that have a price and a brand. It would be available in some supermarkets, and you would invest in marketing to get consumers to adopt it.

For premium water, you might invest in dedicated customer support to guide specific clients who are critical to your business.

How data would turn into a data product if it were water (courtesy: @business & decisions)

If data is a product, which additional criteria should we use to evaluate it?

Five strategic criteria to evaluate your data portfolio

The five criteria that will help you drive your data transformation in 2023 and beyond

Let's go back to the data product. Every product has a price, even internally. You might call it transfer pricing, cost allocation, whatever: there should be a price for data too. When considering data products, we also talk about licenses. Data is never purchased; you buy a right to use it under specific conditions. There should be licenses internally as well (and there is no need to invest in smart contracts and blockchain for that).

Two benefits:

  • Putting a price on a data product will help you increase adoption, get feedback on the ROI of this data, and potentially lead to tax optimisation opportunities. That's what every major platform is doing today. When Jeff Bezos issued his "API mandate" to Amazon employees, bear in mind there was a contract behind each of these.
  • Putting a license behind each data product will help you avoid unintended fraud and promote use cases that would otherwise fail to be communicated internally. How many consulting firms purchase reports and datasets from large data vendors, and fail to enforce the conditions of use on their consultants? How many banks are spending millions on compliance to ensure that the transaction data collected by the retail departments is not used in a suspect way by the trading desk?

Now let's think about the five strategic criteria to evaluate your data portfolio:

  1. Ease of use: how much time and which additional skills are required to use the data product and derive value?
  2. User support: is there anyone available to explain how to use the data product, and to keep me updated on changes to formats, quality, etc.?
  3. License agreements: are the conditions of use attractive enough to address my own use case, or too restrictive? Are they easy to understand?
  4. Price plan clarity: is the pricing attractive enough to deliver a tangible ROI on my own use case, or too restrictive? Is it easy to understand and to plug into my budget?
  5. Sales reactiveness: is the data product owner (or the team in charge) responsive to new users in the organisation? Is there any collateral or a webpage describing the data product?

All of these criteria were developed by Starzdata to evaluate open data and data providers. We had to do this because nothing of the kind is available on the market today. And it is badly needed, as you hear when you listen to clients every day, as we do (see this reddit post):

Sales reactiveness is a real pain today in the data & information economy

The Data product selling exercise

We're in a recession: every dollar matters, whether you're a startup or a large corporate.

Engaging your team or your management in selling your data internally, as a portfolio of data products, will force your organisation to evaluate it and make the right decisions in 2023.

By evaluating your data portfolio against the eight criteria used by Starzdata, you will be able to achieve three objectives:

  1. Drastically reduce your investments in data quality programs and prioritise IT expenditures on high potential Data products
  2. Challenge your expenditures in third party data acquisitions, from Factiva & Bloomberg subscriptions to data directly purchased through AWS.
  3. Discover hidden opportunities to reuse legacy data as data products, and create a massive impact on your P&L without substantial investments.
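As a starting point, the eight criteria from this post (perceived quality, coverage, freshness, plus the five strategic criteria) can be captured in a simple scorecard. The sketch below is only an illustration: the 1-5 scale, the unweighted average, the class and function names, and the sample products are all assumptions, not a description of how the Starzdata platform scores anything.

```python
from dataclasses import dataclass
from statistics import mean

# The eight criteria discussed in this post, as identifier-style keys.
CRITERIA = (
    "perceived_quality", "coverage", "freshness",
    "ease_of_use", "user_support", "license_agreements",
    "price_plan_clarity", "sales_reactiveness",
)


@dataclass
class DataProductReview:
    product: str
    use_case: str   # context of the reviewer, as suggested in Takeaway #1
    scores: dict    # criterion -> score on an assumed 1-5 scale

    def __post_init__(self):
        # Require exactly one score per criterion so averages are comparable.
        if set(self.scores) != set(CRITERIA):
            raise ValueError(f"scores must cover exactly: {CRITERIA}")


def rank_products(reviews):
    """Average each product's review scores (unweighted), highest first."""
    per_product = {}
    for review in reviews:
        per_product.setdefault(review.product, []).append(
            mean(review.scores.values())
        )
    ranked = ((name, mean(avgs)) for name, avgs in per_product.items())
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical reviews; product names and scores are illustrative only.
reviews = [
    DataProductReview("crm_contacts", "cash management", dict.fromkeys(CRITERIA, 2)),
    DataProductReview("web_traffic", "marketing", dict.fromkeys(CRITERIA, 4)),
]
for name, score in rank_products(reviews):
    print(name, score)
```

Products at the bottom of such a ranking are candidates for reduced spending; products at the top with few users are candidates for internal promotion.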


If you want to know more about how Starzdata can help you address those three challenges in 2023 and beyond, or if you simply want to know more about our approach to data product evaluation, drop me a line: [email protected]

Raman Kalia

CX, Omnichannel, InfoSec, Digital Transformation, CRM

1 year ago

Mathieu Colas, one thought that comes to my mind is the silos of data processing units within one organisation and their leaning towards a tool of their choice. As far as CRM is concerned, I'm a strong believer that one should also use it as a congregation tool, beyond its capabilities for data dissemination.
