When sampling is OK, and when sampling is NOT OK.

Dr. Augustine Fou

FouAnalytics - "see Fou yourself" with better analytics

Published: November 1, 2024

Advertisers are trying to save costs. They know budgets are getting tighter going forward and they have to do more with the money they have. Sampling is one way to save costs -- i.e. measuring only some of the impressions, not all of them. But the question is "when is sampling OK, and when is it NOT OK?"

When sampling is OK

As a scientist (yes, I have a PhD in Materials Science and Engineering from MIT), I always recommend having complete data. Why? Because when you sample, you run the risk of missing some important details in the stuff you didn't measure. This is even more important in fraud detection, because the fraud could be in the 4 in 5 impressions you DIDN'T measure (1 in 5 sample rate), or the other 9 in 10 impressions you DIDN'T measure (1 in 10 sample rate).

So how do we do this -- sampling -- practically? In FouAnalytics, we recommend always starting off "full bore" which means measuring everything (10 in 10 impressions) and not sampling at the beginning. Then if we see that the data is highly reproducible, as in the following example -- 170 billion pageviews, rock solid repeatable day after day -- then we can start sampling. Because the data is highly reproducible, the risk is lower if we start to sample.
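
One way to make "highly reproducible" concrete is to check how much the daily totals vary before switching sampling on. The sketch below is only an illustration, not how FouAnalytics decides: the function names, the example counts, and the 5% threshold are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def coefficient_of_variation(daily_counts):
    """Population standard deviation divided by the mean of the daily counts."""
    avg = mean(daily_counts)
    if avg == 0:
        raise ValueError("mean is zero; variation is undefined")
    return pstdev(daily_counts) / avg


def ok_to_sample(daily_counts, max_cv=0.05):
    """Hypothetical rule of thumb: sample only if day-to-day variation is small.

    max_cv is an assumed threshold, not a value taken from the article.
    """
    return coefficient_of_variation(daily_counts) <= max_cv


# Made-up daily pageview counts from a "full bore" measurement period.
steady = [1000, 1010, 990, 1005, 995, 1002, 998]
spiky = [1000, 1010, 990, 5000, 995, 1002, 998]

print(ok_to_sample(steady))  # steady series: low variation
print(ok_to_sample(spiky))   # series with a one-day spike: high variation
```

A check like this only says the totals are stable; it is still worth looking at the breakdown by traffic type before lowering the sample rate.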

If the data is highly variable, it's not a good idea to sample, because we could easily miss important things, for example the following. The yellow means search crawlers, the orange means declared bots, and the red means bad bots. In the second chart below, note how large the green spike is, and how short-lived it is. If we were sampling, we could have missed seeing that large bot attack entirely.

When sampling is NOT OK

Many advertisers don't know that their current legacy fraud verification vendor is sampling. Why don't advertisers/buyers know this? It's because they are being charged "full bore" even if the measurement is being sampled - 1 in 100, or worse. You don't have to believe me. Ask your current verification vendor for a report that shows you the number of impressions you were charged for AND the number of impressions they actually measured. So, so simple, right? Right, but the advertisers paying for these services never thought to ask this question.

"Ask your current verification vendor for a report that shows you the number of impressions you were charged for AND the number of impressions they actually measured."
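
Once you have both numbers from that report, the comparison is simple arithmetic. This is a minimal sketch; the function name and the example figures are hypothetical and do not come from any vendor's report.

```python
def measurement_coverage(charged_impressions, measured_impressions):
    """Fraction of the impressions you paid for that were actually measured."""
    if charged_impressions <= 0:
        raise ValueError("charged_impressions must be positive")
    if not 0 <= measured_impressions <= charged_impressions:
        raise ValueError("measured_impressions must be between 0 and charged_impressions")
    return measured_impressions / charged_impressions


# Hypothetical report: charged for 1,000,000 impressions, 10,000 measured.
coverage = measurement_coverage(1_000_000, 10_000)
print(f"Measured {coverage:.1%} of charged impressions")
print(f"Not measured: {1 - coverage:.1%}")
```

A coverage of 1% corresponds to the 1 in 100 sampling described above.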

In the case of fraud verification, if they are not measuring 99 out of 100 impressions, don't you think it's super easy for the bots and fraud to hide in the 99 out of 100 and get away with it? In other words, what was not measured obviously couldn't be marked as IVT. But not getting marked as IVT doesn't mean it wasn't IVT. Think about this for a second. It WAS bots and fraud and invalid traffic even though it didn't get marked as such. It's because the vendor didn't even measure it. So the 1% that the legacy vendors have been reporting for the last 8 years is not all the fraud there is; it's only the fraud that they could detect in what they measured. These are the numbers that TAG and ANA are citing in their press releases, which ultimately mislead advertisers into thinking the problem of fraud is low, when fraud is at its highest levels ever.
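
To see how little a sampled IVT number pins down, here is a rough worked example. It is a sketch under stated assumptions: the function name and all figures are hypothetical, every flagged impression is assumed to be truly invalid, and nothing at all is assumed about the unmeasured impressions. The result is a range of what the data allows, not an estimate of actual fraud.

```python
def ivt_rate_bounds(total_impressions, measured_impressions, flagged_ivt):
    """Lowest and highest IVT rate consistent with a sampled measurement.

    Lower bound: only the flagged impressions were invalid.
    Upper bound: the flagged impressions plus every unmeasured one were invalid.
    """
    if total_impressions <= 0:
        raise ValueError("total_impressions must be positive")
    if not 0 <= flagged_ivt <= measured_impressions <= total_impressions:
        raise ValueError("expected 0 <= flagged <= measured <= total")
    unmeasured = total_impressions - measured_impressions
    lower = flagged_ivt / total_impressions
    upper = (flagged_ivt + unmeasured) / total_impressions
    return lower, upper


# Hypothetical campaign: 1,000,000 impressions, 1 in 100 measured,
# and 1% of the measured impressions flagged as IVT.
low, high = ivt_rate_bounds(1_000_000, 10_000, 100)
print(f"IVT rate is somewhere between {low:.2%} and {high:.2%}")
```

With complete measurement (measured equal to total), the two bounds collapse to the same number, which is the case for measuring everything.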

Finally, advertisers, agencies, and publishers are realizing that these legacy verification vendors have been severely underreporting the fraud and brand safety issues. It's not just me saying it any more. Isn't it time you asked your verification vendor to show you what they actually measured versus what they charged you for? Sampling is NOT OK when it causes you to miss most of the fraud and brand safety issues.

More case examples of screen shots from FouAnalytics: https://www.dhirubhai.net/in/augustinefou/recent-activity/newsletter/
