Match Rate B******t

Mark Pilipczuk

Advisory | Marketer | Board Member | Слава Україні!

Published: July 15, 2024

A couple of days ago, Tom Goodwin pointed me to this article(1) in Mi3Australia discussing the abysmal quality of much targeting data. I agree with a lot of the findings, as I wrote here.

What was really interesting was the links to a couple of research papers, particularly "Is First- or Third-Party Audience Data More Effective For Reaching The 'Right' Customers? The Case of IT Decision-Makers," by Nico Neumann, Catherine Tucker, Kumar Subramanyam, and John Marshall, published September 10, 2023 (2). My annotated version of the paper is linked below.

I've always hated match rate as a KPI.

And I don't like ratio metrics unless I absolutely understand and trust both the numerator and denominator. Otherwise, you're just dividing numbers. The paper above by Neumann et al. shows us some crystal-clear examples of why Match Rate is bullshit, even though that wasn't the intent of the paper.

In the paper, the authors tested purchased records of IT decision makers (ITDM), as well as a marketer's own CRM database of ITDM, and compared the results across four different onboarding platforms. What made it a great test is that they also obfuscated some of the data before presenting it to the onboarders.

I've summarized the data in the following tables:

Table 1: Overall data, match rates at left with indices at right

Table 2: Onboarder match rates for prospecting data

Table 3: Onboarder match rates for CRM data

The test construction was clever and simple enough that even the most ill-intentioned onboarder wasn't able to corrupt the test. The core element of the test design was the addition of five random digits to the real email addresses before they were hashed and presented to the onboarder. Doing this turns the HEM (hashed email) presented to the onboarder into gibberish.
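
Here is a minimal sketch of that salting step, so you can picture what the onboarder actually receives. It rests on two assumptions of mine: that the hash is SHA-256, and that the five digits are appended to the local part of the address (this article doesn't say where the authors put them). The function names are hypothetical, not anything from the paper.

```python
import hashlib
import random


def hash_email(email: str) -> str:
    """Normalise (trim, lowercase) an email address and return its SHA-256 hex digest."""
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def salt_email(email: str, rng: random.Random, digits: int = 5) -> str:
    """Append random digits to the local part so the address no longer exists.

    Assumption: the digits go just before the '@'.
    """
    local, _, domain = email.strip().lower().partition("@")
    salt = "".join(rng.choice("0123456789") for _ in range(digits))
    return f"{local}{salt}@{domain}"


rng = random.Random(42)  # fixed seed only so the example is repeatable
real = "jane.doe@example.com"  # placeholder address
salted = salt_email(real, rng)

real_hem = hash_email(real)      # what a normal onboarding file would carry
salted_hem = hash_email(salted)  # gibberish: no real person has this address
```

Because a hash changes completely when its input changes at all, the salted HEM gives an honest matcher nothing to work with on the email field.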

Let's see how they did: Prospecting data, table 2

As you can see, the match rates for HEM + name vary from 12.3% (Facebook) to 75.2% (the unnamed "Onboarder 1"). It is interesting that Facebook and Google, who basically see everybody and have practically unlimited resources, have lower match rates than the unnamed onboarders, who both dramatically outperform Google and Facebook.

It's when the emails are salted with the five digits and turned to gibberish that it gets interesting. Both Google and Facebook--who basically see everybody and have unlimited resources--see their match rates drop to sub-1%, as does Onboarder 2. Between 93% and 98% of all matches disappear, as you might expect. After all, you fed the onboarder gibberish for one of the key fields--the email address.

With Onboarder 1, however, we see a different picture. While they lost 51% of the matches with the addition of 100% gibberish email, they maintained a 37.1% match rate.
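
Note that the 51% is the share of the original matches that vanished, not the drop in percentage points. A quick check of the arithmetic, using only the two Onboarder 1 rates quoted above (the function name is mine):

```python
def share_of_matches_lost(clean_rate: float, salted_rate: float) -> float:
    """Fraction of clean-data matches that disappear once the emails are salted."""
    if clean_rate <= 0:
        raise ValueError("clean_rate must be positive")
    return (clean_rate - salted_rate) / clean_rate


# Onboarder 1, prospecting data: 75.2% clean, 37.1% with gibberish emails.
lost = share_of_matches_lost(clean_rate=75.2, salted_rate=37.1)
print(f"{lost:.1%}")  # about 51% of the matches lost
```

An onboarder that really depends on the email address should land close to 100% on this measure, which is roughly where the other three ended up.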

I found myself wondering how loosely they turned the matching dial. Do you? Yet, if presented with this data you might say, "Wow, I need to do business with Onboarder 1."

It gets worse when we look at onboarded CRM data, table 3

As you might expect, the CRM data match rates suggest the IT company's data is of better quality than prospecting lists sourced elsewhere. They may have acquired the CRM data from outbound marketing efforts, inbound queries, and probably also vetted the contact information through their salesforce outreach to those prospects.

And that's what you see. Match rates are much higher across onboarders, ranging from 44% (Google) to 78.2% (Onboarders 1 and 2).

Yet when the data is obfuscated before presentment to the onboarders, you see the same thing happen at Facebook and Google--remember, they see basically everybody and have practically unlimited resources to get good at this--with match rates dropping to sub-1% as with the first case. Onboarder 2 does marginally better, but still loses 95% of all the matches when the HEM is turned to gibberish.

However, Onboarder 1 loses only--surprise, surprise--53% of the CRM matches when presented with gibberish HEMs and can still "match" 37.1% of gibberish.

Why is this worse?

The problem gets worse because when you do a crude onboarding test (probably after reading their nonsensical white paper), Onboarder 1 will claim "we can match both prospects and existing customers at the same high rate, enabling your marketing media spend to perform more efficiently."

Said another way, if you run a dumb match rate test, Onboarder 1 wins every time.

And you lose (money) every time.

And when this "great" onboarder starts to lose match rate tests--maybe to some even more charlatanistic vendor? Simple: They just loosen the dials until they win the test.

TL;DR: What should I do?

This is why you should read academic papers. You can get new ideas and build off what you've just read. (Hmm, maybe we heard something like that in college when we studied the scientific process?)

Whenever you onboard audiences, always take an nth of the audience and salt it by doing something like the authors of the paper, e.g. add random digits to the email addresses before hashing.
Monitor the match rates between your good/non-obfuscated data and your obfuscated data.
Monitor the match rates between platforms and index the results.
Look for relative changes within and across platforms.
Eyeball the records that come back. That means actually looking at the data, not just the reports. See link (3) below for more tips on looking at records.
Ask yourself whether you should ever see match rates of >0% when trying to match gibberish/obfuscated data.
Ask the vendor (not "partner") the same questions above. Do not finalize any onboarding process, use the data, or pay any invoices until you are absolutely satisfied with the answers.
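
The monitoring steps above can be sketched in a few lines. Treat this as one possible way to do it, not the paper's method: the function names, the choice to index against the cross-platform average, and the 1% tolerance are all my assumptions, and the vendor names and counts are made up purely to show the mechanics.

```python
from statistics import mean


def every_nth(records: list[str], n: int, offset: int = 0) -> list[str]:
    """Take an nth sample: every n-th record, starting at `offset`."""
    return records[offset::n]


def match_rate(matched: int, sent: int) -> float:
    """Match rate as a percentage of the records sent."""
    return 100.0 * matched / sent if sent else 0.0


def index_rates(rates: dict[str, float]) -> dict[str, float]:
    """Index each platform against the cross-platform average (average = 100).

    Assumption: the average is the base; pick whatever base suits your reporting.
    """
    base = mean(rates.values())
    return {name: round(100.0 * rate / base, 1) for name, rate in rates.items()}


def flag_gibberish_matchers(salted_rates: dict[str, float], tolerance: float = 1.0) -> list[str]:
    """Platforms that 'match' more than `tolerance` percent of salted records."""
    return sorted(name for name, rate in salted_rates.items() if rate > tolerance)


# Made-up counts for two hypothetical vendors; replace with your own results.
sent_clean, sent_salted = 9000, 1000
clean = {
    "Vendor A": match_rate(4500, sent_clean),
    "Vendor B": match_rate(6300, sent_clean),
}
salted = {
    "Vendor A": match_rate(5, sent_salted),
    "Vendor B": match_rate(350, sent_salted),
}

print(index_rates(clean))               # relative standing on clean data
print(flag_gibberish_matchers(salted))  # who is matching gibberish?
```

In this made-up run, Vendor B wins on the clean data and is also the one flagged for matching salted records, which is exactly the pattern to look for. None of this replaces eyeballing the records that come back.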

Many thanks to the authors of the paper. I recommend you not only read their paper, but take a look at some of the articles cited in their work. You may find yet more ideas to improve your marketing performance.

And the good news is you don't have to read any vendor (not "partner") propaganda/whitepapers.

Bonus: Graphs for visual learners

I put tables 1-3 into graphical format in case you work better with images.

Figure 3: Onboarder match rates by list type (Red and blue bars are unobfuscated data; Yellow and green are obfuscated data)

Figure 4: Onboarder match rate index by data type

Links to articles

Linkedin links are not working.

(1) $700bn delusion: Does using data to target specific audiences make advertising more effective? Latest studies suggest not. https://www.mi-3.com.au/26-06-2024/data-delusion-does-using-data-target-specific-audiences-advertising-actually-make

(2) My annotated version of "Is First- or Third-Party Audience Data More Effective For Reaching The 'Right' Customers? The Case of IT Decision-Makers," https://drive.google.com/file/d/1ANE5lUrVUp1jniEHxayO6nKIGuUqrg8O/view?usp=share_link

(3) "Who pulled the names?" Tips for looking at merge/purge and data nths, applicable for match rate test analysis. https://markpilip.com/2020/12/04/who-pulled-the-names/

Mark Pilipczuk

Advisory | Marketer | Board Member | Слава Україні!

8 months ago

LinkedIn's editing tool wasn't embedding links properly. Here's the post I wrote that triggered this further reading, as well as a link to my annotated version of the paper that provided the data I used in my charts and graphs. Credit for all the hard work goes to the original authors: Nico Neumann, Catherine Tucker, Kumar Subramanyam, and John Marshall, as well as to Tom Goodwin whose Twitter post got me started down the rabbit hole. https://www.dhirubhai.net/posts/markpilipczuk_700bn-delusion-does-using-data-to-target-activity-7216824931475431425-Kabi?utm_source=share&utm_medium=member_desktop https://drive.google.com/file/d/1ANE5lUrVUp1jniEHxayO6nKIGuUqrg8O/view

John Lane

I help companies align & engage their teams, driving productivity, reaching business goals, reducing turnover costs—while creating more satisfied employees who fuel long-term growth. Podcaster | Disc Golf | Curious? DM!

8 months ago

If you follow Mark Pilipczuk he will do the reading, then explain it to us kids in the back. Thanks Mark!
