Why FouAnalytics works, even without complete measurement


Let's talk about measurement. This is a long one, but stick with me. Hopefully the visuals and examples make this easy to understand. If not, let me know.

Precision vs Accuracy

Precision means the reproducibility of the result. In the depiction below, precise measurements are tightly clustered, even if they are not on the bullseye of the target. Accuracy means whether the results are correct or not. In the same depiction, accurate measurements hit the bullseye. Measurements that are both accurate and precise are tightly clustered in the bullseye!

The measurement from a legacy fraud verification vendor is neither precise nor accurate. The red rectangle in the spreadsheet below (a report from one of the legacy verification vendors that has MRC accreditation) highlights a row which reports "Obfuscated BundleID." There were 110,093,619 ("110 million") monitored ads, but only 9,793,336 ("10 million") were measured with a JavaScript tag. The other 100 million ads were NOT measured with a JavaScript tag, and yet 110,077,849 ("110 million") were marked as "Fraud/SIVT Free ads." Further, "obfuscating bundle IDs" is the very definition of SIVT ("sophisticated invalid traffic") according to the MRC's guidelines for accreditation.

The Media Rating Council’s IVT Detection and Filtration Guidelines Addendum states: “The second category, herein referred to as “Sophisticated Invalid Traffic” or SIVT, consists of more difficult to detect situations that require advanced analytics, multi-point corroboration/coordination, significant human intervention, etc., to analyze and identify. Key examples are: [...] Domain and App misrepresentation: App ID spoofing, domain laundering and falsified domain / site location”.

All of the 110 million impressions should have been marked as SIVT because the bundleIDs were obfuscated. So the measurement from this legacy fraud verification vendor was "inaccurate" -- i.e. not correct.
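The arithmetic behind that row can be checked in a few lines. This is only a sketch using the three counts quoted above; the variable names are mine, not the vendor's column headers.

```python
# Counts quoted from the "Obfuscated BundleID" row of the vendor report.
monitored_ads = 110_093_619
measured_with_js_tag = 9_793_336
marked_fraud_free = 110_077_849

# Ads that were never measured with a JavaScript tag.
not_measured = monitored_ads - measured_with_js_tag

# Share of monitored ads that were actually measured.
measured_share = measured_with_js_tag / monitored_ads

# Share of monitored ads that were nevertheless marked "Fraud/SIVT Free".
fraud_free_share = marked_fraud_free / monitored_ads

print(f"not measured:     {not_measured:,}")
print(f"measured share:   {measured_share:.1%}")
print(f"fraud-free share: {fraud_free_share:.2%}")
```

Fewer than one in ten ads were measured, yet nearly all of them were reported as clean.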

In the other yellow highlighted rows, the measurements are again neither accurate nor precise. For example, example.com carries no ads; it is the domain used by MediaTrust, a cybersecurity vendor that scans ads with its crawler/bot to look for malicious code. 100% of these impressions should be marked as IVT (bots). Yet only 59% and 48% were marked as IVT. The fact that fewer than 100% were marked as bots, when these are obvious bots, means this vendor's measurement was not accurate. The fact that there was no consistency or reproducibility (59% vs 48%) in marking the same domain as IVT means this vendor's measurement was not precise either.
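To put rough numbers on "not accurate" and "not precise" for that domain, here is a minimal sketch. It assumes the correct IVT rate is 100% (as argued above) and uses the two reported rates. The function name and the choice of mean error and standard deviation as the two yardsticks are my own illustration, not part of any vendor's or FouAnalytics' methodology.

```python
from statistics import mean, pstdev


def bias_and_spread(measurements, true_value):
    """Return (bias, spread) for repeated measurements of one quantity.

    bias   -> how far the average is from the true value (accuracy)
    spread -> how much the measurements disagree with each other (precision)
    """
    bias = mean(measurements) - true_value
    spread = pstdev(measurements)
    return bias, spread


# The two IVT rates reported for the same crawler domain.
reported_ivt_rates = [0.59, 0.48]

bias, spread = bias_and_spread(reported_ivt_rates, true_value=1.00)
print(f"bias:   {bias:+.3f}")    # far from zero means not accurate
print(f"spread: {spread:.3f}")   # above zero means not reproducible
```

An accurate and precise measurement would show a bias and a spread that are both close to zero.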


Probabilistic vs Deterministic

Probabilistic means something is "probably" a bot or "probably" not a bot. That depends on the probability percentage and a threshold, above which you determine it is a bot, and below which it's not a bot. But the problem with probability is which threshold percentage you choose for the determination. For example, if the probability is above 70%, you label it a bot; or you can choose the threshold to be 50%, above which it is a bot, and below which it isn't a bot. What about 30% then, or 10%? What threshold do you choose? What probability is the right threshold to accurately label a bot to be a bot? You see why this is a problem?
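A small sketch makes the threshold problem concrete. The scores below are made up for illustration; they are not output from any real detection vendor. The same eight visits are labeled very differently depending on which threshold you happen to pick.

```python
def count_labeled_bots(scores, threshold):
    """Count visits whose bot probability is above the threshold."""
    return sum(1 for score in scores if score > threshold)


# Hypothetical bot-probability scores for eight visits.
scores = [0.05, 0.20, 0.35, 0.45, 0.55, 0.65, 0.80, 0.95]

for threshold in (0.10, 0.30, 0.50, 0.70):
    bots = count_labeled_bots(scores, threshold)
    print(f"threshold {threshold:.0%}: {bots} of {len(scores)} labeled as bots")
```

Nothing about the traffic changed between the four lines of output; only the arbitrary cutoff did.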

Deterministic is where you have data to definitively mark a bot a bot, and even show users the supporting data so they understand and agree that it was correctly marked. In the following example, notice the large orange spike. Orange in FouAnalytics means "declared bot" -- one which tells you it is a bot. The supporting data shows you that in the user agent, it says "HeadlessChrome." This is an automated browser that developers use to test websites, and it is a tool used by others to scrape content and do other automated tasks. This is a bot, and it declared itself to be a bot. The other data points show the same video screen resolution, the same window size, and the same ISP name and IP address. This was a homemade bot run from someone's house which uses Google Wifi.
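As a rough illustration of what "declared bot" means, the check can be as simple as looking for the marker in the user agent and returning the evidence along with the verdict. This is a sketch only, not FouAnalytics' actual code; the marker list holds just the one example from above, and the sample user agent strings are hypothetical.

```python
# Marker taken from the example above; a real list would be longer (assumption).
DECLARED_BOT_MARKERS = ("HeadlessChrome",)


def declared_bot_evidence(user_agent):
    """Return the marker found in the user agent, or None if there is none.

    Returning the marker itself (rather than just True/False) keeps the
    supporting data visible, so a person can see why the label was applied.
    """
    for marker in DECLARED_BOT_MARKERS:
        if marker in user_agent:
            return marker
    return None


# Hypothetical user agent strings for illustration.
headless_ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36"
)
regular_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)

print(declared_bot_evidence(headless_ua))
print(declared_bot_evidence(regular_ua))
```

There is no probability and no threshold here: the marker is either present in the data or it is not.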

You might still object to the above and say it was an obvious example where the bot told you it was a bot. What about SIVT ("sophisticated invalid traffic"), where bots are more advanced and deliberately disguise themselves? Can you -- FouAnalytics -- still deterministically detect them and "show your work"? Yes.

In this example, note the super large red spike in the chart above. It's marked 100% dark red (confirmed bots); let's look at the supporting data grids below to understand what is happening. In this case, the bot maker was faking and rotating the referrer (to make it look like the bot came from various places like google, reddit, linkedin, tiktok, etc.). The bot maker also rotated through a list of window sizes from 1920x1080, 1366x768, etc. and even browser version numbers like Chrome 120, 121, 119, 118, 117, 115, etc. But you notice all of the traffic was from the same datacenter (AS212238 Datacamp Limited) and the same IP address (149.40.58.148). This bot maker didn't pay for IP address rotation or residential proxies to disguise the IP address. There are many other factors that confirm this is a bot, but I won't go into detail here.
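The pattern in those data grids (many rotating values, one IP address) can be surfaced by grouping hits by IP and listing the distinct values seen for each field. The sketch below is my own simplified illustration, not the FouAnalytics implementation; the field names and the four sample hits are assumptions modeled on the example above.

```python
from collections import defaultdict


def summarize_by_ip(hits):
    """Group hits by IP address and collect the distinct values per field."""
    summary = defaultdict(
        lambda: {"hits": 0, "referrers": set(), "window_sizes": set(), "browsers": set()}
    )
    for hit in hits:
        row = summary[hit["ip"]]
        row["hits"] += 1
        row["referrers"].add(hit["referrer"])
        row["window_sizes"].add(hit["window_size"])
        row["browsers"].add(hit["browser"])
    return dict(summary)


# Hypothetical hits shaped like the example: everything rotates except the IP.
hits = [
    {"ip": "149.40.58.148", "referrer": "google", "window_size": "1920x1080", "browser": "Chrome 120"},
    {"ip": "149.40.58.148", "referrer": "reddit", "window_size": "1366x768", "browser": "Chrome 121"},
    {"ip": "149.40.58.148", "referrer": "linkedin", "window_size": "1920x1080", "browser": "Chrome 119"},
    {"ip": "149.40.58.148", "referrer": "tiktok", "window_size": "1366x768", "browser": "Chrome 118"},
]

for ip, row in summarize_by_ip(hits).items():
    print(ip, row["hits"], sorted(row["referrers"]), sorted(row["browsers"]))
```

A single machine does not run four different browser versions at once, so the grouped view itself is the supporting data a person can inspect.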

Suffice it to say, deterministic detection is more accurate than probabilistic AND FouAnalytics shows you the details so you can understand why the bot was marked as a bot, in this case SIVT because of the more advanced things it was doing to disguise itself. Deterministic detection also means you don't have arbitrary thresholds above which it is "probably" a bot and below which it "probably" isn't a bot.


Big data vs "rep sample" (representative sample)

Since someone brought this up yesterday, I will address it here. In the Adalytics report that exposed Forbes for doing shady things -- https://adalytics.io/blog/ads-observed-on-www3-forbes-subdomain, FouAnalytics was listed among the verification vendors "receiving telemetry" on ads that appeared on the www3.forbes subdomain, misrepresented as regular Forbes inventory. The question posed was why did FouAnalytics miss this just like all the other legacy verification vendors missed it. Note also that all the other highlighted vendors have both TAG certification and MRC accreditation, and FouAnalytics had neither.

In this case, Adalytics found a FouAnalytics in-ad tag in ad impressions appearing on www3.forbes. FouAnalytics in-ad tags run postbid to measure where the ads went (e.g. the www3.forbes pages), not in the prebid environment where the misrepresentation of the domain occurred. All of the other vendors were accredited by the MRC and certified by TAG for their ability to detect SIVT. And they were selling prebid filtering services specifically to detect and prevent this kind of domain spoofing, which Adalytics documented as having gone on for at least the last 7 years. The same failure to detect was documented in 2022: https://www.wsj.com/articles/ad-tech-firms-didnt-sound-alarm-on-false-information-in-gannetts-ad-auctions-11651665602. FouAnalytics does not sell prebid blocking (because that is useless). FouAnalytics is an analytics platform that reports to advertisers where their ads went; some of the impressions we measured did end up on www3.forbes. Looking back at the last 3 months (Q1 2024), the following counts are the number of impressions observed by FouAnalytics to have appeared on www3.forbes. Each user_id is a separate in-ad account, and the total is the count of impressions for the last 3 months.

FouAnalytics also detected the mismatch in 2022. When FouAnalytics receives information about the domain or pageurl declared in the bid request (via macros from the DSP or ad server), we compare it to the detected domain/pageurl; if there is a mismatch, it is marked as spoofing:1.
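The comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the production logic: I assume an exact hostname comparison, the function names are hypothetical, and the domains are placeholder `.example` names rather than real publishers.

```python
from urllib.parse import urlsplit


def hostname(domain_or_url):
    """Extract a lowercase hostname from either a bare domain or a full URL."""
    value = domain_or_url.strip().lower()
    if "://" not in value and not value.startswith("//"):
        value = "//" + value  # lets urlsplit treat a bare domain as a host
    return urlsplit(value).hostname or ""


def spoofing_flag(declared, detected):
    """Return 1 on mismatch, 0 on match, None if nothing was declared."""
    if not declared:
        return None  # no macro data received, so nothing to compare
    return 1 if hostname(declared) != hostname(detected) else 0


# Declared in the bid request vs. where the tag actually ran.
print(spoofing_flag("www.news.example", "https://www3.news.example/article"))
print(spoofing_flag("www.news.example", "https://www.news.example/article"))
print(spoofing_flag("", "https://www3.news.example/article"))
```

The third case matters: when the DSP or ad server does not pass the declared domain, there is nothing to compare, so the sketch returns None rather than guessing.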

This brings us to the concept of big data versus "rep sample" ("representative sample"). Once FouAnalytics added in-ad measurement in 2015 (see: FouAnalytics Origin Story), we quickly maxed out the capabilities of normal databases like MySQL. Compared to the pageviews on websites, the quantities of ad impressions were so enormous (10s of billions) and the rate of impressions being detected by our in-ad tag was so large that we had to eliminate databases entirely. Raw data is written directly to S3 (encrypted in motion and at rest). The question is whether we have to analyze the truly massive "big data" or whether we can analyze a smaller sample to get the same insights. If the sample is representative of the larger set of data, then we can. That is why you see 2 donut charts when you log into the FouAnalytics dashboard. The left side donut is called "live data loaded into browser" -- that's the 20,000 data points loaded into your browser and the data that populates the data grids further down the page. The right side donut is "historical total data for date range." That tells you the total quantity of data collected in the date range selected. If the colors in the left and right donut charts appear to be pretty much the same, we know that the sample of 20,000 impressions is representative of the whole.

For example, in the first pair of donut charts above, the dark blue is 24% vs 23% and the dark red is 7% vs 12%. In the second pair, the dark blue is 62% and 66%, and the dark red is 8% and 8%. These are close enough, and the 20,000 impressions are representative. In the third pair of donut charts, the dark blue is 11% and 13% and the dark red is 37% and 41%, again close enough to allow us to study the 20,000 sample because it is representative of the whole. Studying the "rep sample" can yield the insights you need so you don't have to deal with unwieldy "big data."
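The eyeball comparison of the two donuts boils down to looking at the gap, in percentage points, between the sample and the total for each color. Here is a minimal sketch using the percentages quoted above, with the pairs in the order they are listed. The data layout and function name are mine, and I deliberately leave "close enough" as a human judgment rather than hard-coding a cutoff.

```python
# (sample %, total %) for each color, one dict per pair of donut charts.
donut_pairs = [
    {"dark blue": (24, 23), "dark red": (7, 12)},
    {"dark blue": (62, 66), "dark red": (8, 8)},
    {"dark blue": (11, 13), "dark red": (37, 41)},
]


def largest_gap(pair):
    """Largest sample-vs-total difference, in percentage points, across colors."""
    return max(abs(sample - total) for sample, total in pair.values())


gaps = [largest_gap(pair) for pair in donut_pairs]

for number, gap in enumerate(gaps, start=1):
    print(f"pair {number}: largest gap is {gap} percentage points")
```

In all three pairs the sample and the total differ by at most 5 percentage points.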


Context is crucial for insights

Once we understand that analyzing a sample of the data may be sufficient, and analysis of billions of data points is not necessary, the next and final question is whether and how we can derive insights from the data to make rational business decisions. In order to derive the correct insights from the data, we need to understand the context, for example the campaign set up. I have a whole article on this already -- https://www.dhirubhai.net/pulse/why-data-scientists-javascript-coders-cant-do-what-i-fou-g68fe -- so I will just hit a few highlights here. When you see something like the surge in red in the following chart, do you need to take action? This is on-site measurement.

Looking at the supporting data, we can easily see it's a bot coming from Amazon data centers, and Microsoft and Cloudflare data centers.

But looking at the referrer, we see it is blank. That means the bot hit the site directly, and didn't come from any paid media source. Furthermore, the urls don't contain utm_source, utm_campaign, etc. corroborating that the bots didn't cause ads to load and click on ads. That means these bots didn't cost the advertiser valuable media dollars. So while these bots are a nuisance on the site, the advertiser doesn't need to take action.
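Those two checks (blank referrer, no utm parameters) are easy to express in code. This is a sketch of the reasoning only; the function name, the returned field names, and the example URLs are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def paid_media_evidence(page_url, referrer):
    """Collect the two signals discussed above for a single visit."""
    query = parse_qs(urlsplit(page_url).query, keep_blank_values=True)
    utm_params = sorted(name for name in query if name.startswith("utm_"))
    return {
        "blank_referrer": referrer.strip() == "",  # visit hit the site directly
        "utm_params": utm_params,                  # campaign tags found in the URL
    }


# A direct hit with no campaign tags vs. a tagged click from another site.
direct_hit = paid_media_evidence("https://shop.example/products", "")
tagged_click = paid_media_evidence(
    "https://shop.example/?utm_source=newsletter&utm_campaign=spring",
    "https://ads.example/",
)

print(direct_hit)
print(tagged_click)
```

A bot visit that looks like `direct_hit` did not arrive through a paid click, which is the context needed to decide that no media-buying action is required.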

In another example (below), we see a large red spike in on-site measurement. Looking at the supporting data, we could see that it was the web team running a site audit with SiteAuditBot. The digital marketing team didn't have to worry because this was not a bot eating up their digital ad budgets.

Turning to in-ad measurement, context is again crucial to understand if something is fraud or not, or problematic or not. For example, in the slide below, if we see hundreds of impressions per fingerprint (unique device-browser combo), it may be a problem of over-frequency or it may not be, depending on the context. If the campaign setup included frequency caps of 3 per user, then seeing several hundred per user is definitely a problem that needs further action. But if the campaign set up called for 500 per month, and this measurement included data from a month's timeframe, then these counts are exactly right, and within campaign parameters. This is why I said looking at the data in isolation may lead to incorrect insights or actions; context (like the campaign set up) is crucial to interpreting the data correctly and taking the appropriate action.
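The same counts can be a problem or perfectly fine depending on the cap that was set in the campaign. A small sketch, with made-up fingerprints and a hypothetical function name, shows how the verdict flips with the context:

```python
from collections import Counter


def over_frequency(fingerprints, cap):
    """Return {fingerprint: impressions} for fingerprints that exceed the cap."""
    counts = Counter(fingerprints)
    return {fp: n for fp, n in counts.items() if n > cap}


# One hypothetical fingerprint saw 300 impressions, another saw 2.
impressions = ["fp-a"] * 300 + ["fp-b"] * 2

print(over_frequency(impressions, cap=3))    # campaign capped at 3 per user
print(over_frequency(impressions, cap=500))  # campaign allowed 500 per month
```

The data did not change between the two calls; only the campaign setup did.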

Finally, for in-ad measurement, users of FouAnalytics can make business decisions even if we don't have 100% measurement. The screen shot below shows the top 10 bad guys "worst offenders" and the types of fraud associated with each domain or app. These worst offenders are sorted by "prevalence" which means the portion of your overall ads that each bad guy is eating up. For example, 6% of your impressions, 5%, 4%, etc. By reviewing the worst offenders that have the highest prevalence and deciding which to turn off, you will have the most significant impact on improving the quality of your media buys.

If a site or app is eating up a few dozen impressions out of millions, it doesn't matter. You won't need to review tens of thousands of sites and apps and add those to your block list. Reviewing the top 10 - 20 bad guys every week or two will be sufficient for you to progressively clean your campaigns. In the slide below, within the first 5 days of getting in-ad data, we reduced the dark red by half -- from 48% to 23% -- by adding the top bad guys (most prevalent bad sites and apps) to the block list.
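Sorting by prevalence is straightforward to sketch. The example below assumes each impression has already been labeled good or bad (how that label is produced is outside this sketch), and the site names and counts are invented to mirror the 6%, 5%, 4% example.

```python
from collections import Counter


def worst_offenders(impressions, top_n=10):
    """Rank bad sites/apps by prevalence: their share of ALL impressions.

    `impressions` is a list of (site_or_app, is_bad) tuples.
    """
    total = len(impressions)
    bad_counts = Counter(site for site, is_bad in impressions if is_bad)
    return [(site, count / total) for site, count in bad_counts.most_common(top_n)]


# Hypothetical labeled impressions: 100 in total, 15 of them on bad sites.
impressions = (
    [("badsite-1.example", True)] * 6
    + [("badsite-2.example", True)] * 5
    + [("badsite-3.example", True)] * 4
    + [("goodsite.example", False)] * 85
)

for site, prevalence in worst_offenders(impressions):
    print(f"{site}: {prevalence:.0%} of all impressions")
```

Because prevalence is measured against all impressions, blocking the few entries at the top of this list removes the largest share of bad impressions for the least review effort.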


Let me know if you got this far, and what you think of the above.


Happy Sunday Y'all!


Further reading: 635 other articles on digital marketing, analytics, and ad fraud. https://www.dhirubhai.net/in/augustinefou/recent-activity/newsletter/






Vi Wickam

Managing Partner at Wizard of Ads Online

6 months ago

So few people understand the difference between precision and accuracy. I remember learning this distinction in 10th grade chemistry and how elucidating it was. Thank you for continuing to fight for the people who are being abused by fraud!

Rikard Wiberg

CEO at Pace | MMM & Forecasting

6 months ago

Another great article! The more I read about your profound insights and experience, the more I want to learn and apply them to my MMM analysis and conclusions. Thank you, Dr. Augustine Fou for taking the time to share your knowledge with us.

Dr. Augustine Fou

FouAnalytics - "see Fou yourself" with better analytics
