Match Rate B******t

A couple of days ago, Tom Goodwin pointed me to this article(1) in Mi3Australia discussing the abysmal quality of much targeting data. I agree with a lot of the findings, as I wrote here.

What was really interesting were the links to a couple of research papers, particularly "Is First- or Third-Party Audience Data More Effective For Reaching The 'Right' Customers? The Case of IT Decision-Makers," by Nico Neumann, Catherine Tucker, Kumar Subramanyam, and John Marshall, published September 10, 2023 (2). My annotated version of the paper is linked below.

I've always hated match rate as a KPI.

And I don't like ratio metrics unless I absolutely understand and trust both the numerator and denominator. Otherwise, you're just dividing numbers. The paper above by Neumann et al. shows us some crystal-clear examples of why match rate is bullshit, even though that wasn't the intent of the paper.

In the paper, the authors tested purchased records of IT decision makers (ITDM), as well as a marketer's own CRM database of ITDM, and compared the results across four different onboarding platforms. They also built in a great test by obfuscating some of the data before sending it to the onboarders.

I've summarized the data in the following tables:

Table 1: Overall data, match rates at left with indices at right
Table 2: Onboarder match rates for prospecting data
Table 3: Onboarder match rates for CRM data

The test construction was clever and simple enough that even the most ill-intentioned onboarder wasn't able to corrupt the test. The core element of the test design was adding five random digits to the real email addresses before they were hashed and presented to the onboarder. Doing this turns the HEM (hashed email) presented to the onboarder into gibberish.
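To see why a salted address turns into gibberish once hashed, here is a minimal Python sketch. It assumes the HEM is a SHA-256 digest of the trimmed, lowercased address and that the five digits are appended to the part before the @ sign. Neither detail is spelled out above, so treat both, along with the helper names `hem` and `salt_email` and the made-up address, as illustrative assumptions rather than the authors' exact procedure.

```python
import hashlib
import random


def hem(email: str) -> str:
    """Hash a normalized email (assumption: trim, lowercase, SHA-256)."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def salt_email(email: str, rng: random.Random) -> str:
    """Append five random digits to the local part (assumed placement)."""
    local, _, domain = email.partition("@")
    digits = "".join(rng.choice("0123456789") for _ in range(5))
    return f"{local}{digits}@{domain}"


rng = random.Random(42)  # fixed seed only so the sketch is repeatable
real = "jane.doe@example.com"  # made-up address for illustration
salted = salt_email(real, rng)

# The two digests share nothing recognizable, so an honest exact-match
# lookup on the salted HEM should come back empty.
print(real, hem(real)[:16])
print(salted, hem(salted)[:16])
```

Because the salted address belongs to nobody, any "match" an onboarder returns for it has to come from something other than the email.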

Let's see how they did: Prospecting data, Table 2

As you can see, the match rates for HEM + name vary from 12.3% (Facebook) to 75.2% (the unnamed "Onboarder 1"). It is interesting that Facebook and Google, who basically see everybody and have practically unlimited resources, have lower match rates than the two unnamed onboarders, both of which dramatically outperform them.

It's when the emails are salted with the five digits and turned to gibberish that it gets interesting. Both Google and Facebook--who basically see everybody and have unlimited resources--see their match rates drop to sub-1%, as does Onboarder 2. Between 93% and 98% of all matches disappear, as you might expect. After all, you fed the onboarder gibberish for one of the key fields--the email address.

With Onboarder 1, however, we see a different picture. While they lost 51% of the matches with the addition of 100% gibberish email, they maintained a 37.1% match rate.

I found myself wondering how loosely they turned the matching dial. Do you? Yet, if presented with this data, you might say, "Wow, I need to do business with Onboarder 1."

It gets worse when we look at onboarded CRM data, Table 3

As you might expect, the CRM data match rates suggest the IT company's data is of better quality than prospecting lists sourced elsewhere. They may have acquired the CRM data from outbound marketing efforts, inbound queries, and probably also vetted the contact information through their salesforce outreach to those prospects.

And that's what you see. Match rates are much higher across onboarders, ranging from 44% (Google) to 78.2% (Onboarders 1 and 2).

Yet when the data is obfuscated before presentment to the onboarders, you see the same thing happen at Facebook and Google--remember, they see basically everybody and have practically unlimited resources to get good at this--with match rates dropping to sub-1% as with the first case. Onboarder 2 does marginally better, but still loses 95% of all the matches when the HEM is turned to gibberish.

However, Onboarder 1 loses only--surprise, surprise--53% of the CRM matches when presented with gibberish HEMs and can still "match" 37.1% of gibberish.
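The "lost matches" percentages are relative drops: the gap between the clean and the obfuscated match rate, divided by the clean rate. A quick check in Python, using only the Onboarder 1 rates quoted above (the function name is just for illustration):

```python
def share_of_matches_lost(clean_rate: float, obfuscated_rate: float) -> float:
    """Relative drop in match rate once the emails are salted."""
    return (clean_rate - obfuscated_rate) / clean_rate


# Onboarder 1 match rates quoted in the text, in percent.
prospecting_lost = share_of_matches_lost(75.2, 37.1)
crm_lost = share_of_matches_lost(78.2, 37.1)

print(f"Prospecting: {prospecting_lost:.0%} of matches lost")  # 51%
print(f"CRM: {crm_lost:.0%} of matches lost")                  # 53%
```

An honest exact match on the email should lose close to 100% here, which is roughly what the other three platforms show.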

Why is this worse?

The problem gets worse because when you do a crude onboarding test (probably after reading their nonsensical white paper), Onboarder 1 will claim "we can match both prospects and existing customers at the same high rate, enabling your marketing media spend to perform more efficiently."

Said another way, if you run a dumb match rate test, Onboarder 1 wins every time.

And you lose (money) every time.

And when this "great" onboarder starts to lose match rate tests--maybe to some even more charlatanistic vendor? Simple: They just loosen the dials until they win the test.

TL;DR: What should I do?

This is why you should read academic papers. You can get new ideas and build off what you've just read. (Hmm, maybe we heard something like that in college when we studied the scientific process?)

  1. Whenever you onboard audiences, always take an nth of the audience and salt it the way the authors of the paper did, e.g., by adding random digits to the email addresses before hashing.
  2. Monitor the match rates between your good/non-obfuscated data and your obfuscated data.
  3. Monitor the match rates between platforms and index the results.
  4. Look for relative changes within and across platforms.
  5. Eyeball the records that come back. That means actually looking at the data, not just the reports. See link (3) below for more tips on looking at records.
  6. Ask yourself whether you should ever see match rates of >0% when trying to match gibberish/obfuscated data.
  7. Ask the vendor (not "partner") the same questions above. Do not finalize any onboarding process, use the data, or pay any invoices until you are absolutely satisfied with the answers.
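The monitoring steps above can be sketched in a few lines of Python. Everything here is hypothetical: the `MatchResult` fields, the vendor names, the counts, and the 1% alert threshold are invented for illustration, so swap in whatever your platforms actually report.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MatchResult:
    """Counts reported back by one platform (all figures here are invented)."""
    platform: str
    clean_sent: int
    clean_matched: int
    salted_sent: int
    salted_matched: int

    @property
    def clean_rate(self) -> float:
        return self.clean_matched / self.clean_sent

    @property
    def salted_rate(self) -> float:
        return self.salted_matched / self.salted_sent


def every_nth(records: list[str], n: int) -> list[str]:
    """Step 1: pull every nth record as the sample to salt before hashing."""
    return records[n - 1::n]


def index_clean_rates(results: list[MatchResult]) -> dict[str, float]:
    """Step 3: index each clean match rate to the cross-platform mean (100 = average)."""
    mean_rate = sum(r.clean_rate for r in results) / len(results)
    return {r.platform: 100 * r.clean_rate / mean_rate for r in results}


def flag_loose_matchers(
    results: list[MatchResult], max_salted_rate: float = 0.01
) -> list[str]:
    """Steps 2 and 6: flag platforms that still 'match' salted records."""
    return [r.platform for r in results if r.salted_rate > max_salted_rate]


# Invented example: 9,000 clean and 1,000 salted records sent to each vendor.
results = [
    MatchResult("Vendor A", 9_000, 6_750, 1_000, 370),
    MatchResult("Vendor B", 9_000, 3_960, 1_000, 5),
]

for r in results:
    print(f"{r.platform}: clean {r.clean_rate:.1%}, salted {r.salted_rate:.1%}")
print("Index:", index_clean_rates(results))
print("Ask hard questions of:", flag_loose_matchers(results))
```

In this made-up run, Vendor A wins the clean match rate test and is also the only one flagged. The threshold is a judgment call; per item 6, you could reasonably set it to zero.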

Many thanks to the authors of the paper. I recommend you not only read their paper, but take a look at some of the articles cited in their work. You may find yet more ideas to improve your marketing performance.

And the good news is you don't have to read any vendor (not "partner") propaganda/whitepapers.

Bonus: Graphs for visual learners

I put tables 1-3 into graphical format in case you work better with images.

Figure 1: Prospecting data match rates


Figure 2: CRM data match rates
Figure 3: Onboarder match rates by list type (Red and blue bars are unobfuscated data; Yellow and green are obfuscated data)


Figure 4: Onboarder match rate index by data type


Links to articles

LinkedIn links are not working.

(1) $700bn delusion: Does using data to target specific audiences make advertising more effective? Latest studies suggest not. https://www.mi-3.com.au/26-06-2024/data-delusion-does-using-data-target-specific-audiences-advertising-actually-make

(2) My annotated version of "Is First- or Third-Party Audience Data More Effective For Reaching The 'Right' Customers? The Case of IT Decision-Makers," https://drive.google.com/file/d/1ANE5lUrVUp1jniEHxayO6nKIGuUqrg8O/view?usp=share_link

(3) "Who pulled the names?" Tips for looking at merge/purge and data nths, applicable for match rate test analysis. https://markpilip.com/2020/12/04/who-pulled-the-names/




Mark Pilipczuk


LinkedIn's editing tool wasn't embedding links properly. Here's the post I wrote that triggered this further reading, as well as a link to my annotated version of the paper that provided the data I used in my charts and graphs. Credit for all the hard work goes to the original authors: Nico Neumann, Catherine Tucker, Kumar Subramanyam, and John Marshall, as well as to Tom Goodwin, whose Twitter post got me started down the rabbit hole. https://www.dhirubhai.net/posts/markpilipczuk_700bn-delusion-does-using-data-to-target-activity-7216824931475431425-Kabi?utm_source=share&utm_medium=member_desktop https://drive.google.com/file/d/1ANE5lUrVUp1jniEHxayO6nKIGuUqrg8O/view

