Log level data - should you ask for it or trust it?
Should you ask for log level data? Yes.
Should you trust log level data? No.
Why? Because log level data can be tampered with or entirely fabricated.
For example, take Uber's lawsuit against 100 mobile exchanges for outright fraud, specifically for falsifying the log level data and transparency reports to make it appear that ads ran on legit sites, when the ads actually ran on porn and piracy sites or didn't run at all. In another fraud case that Uber already won, court documents show that an employee of Phunware (NASDAQ: PHUN) wrote: “Guys it’s… time to spin some more BS to Uber to keep the lights on.” In that case the log level data was entirely fabricated; no ads were run.
So, should you ask for log level data? Yes. This is so you can see whether the adtech vendor you are buying from has modern technology, has recorded the data with sufficient detail, and can provide it to you. If they don't, or won't, provide you with log level data, run, don't walk, away. There's a reason they don't, or won't, provide you with log level data. Perhaps they have something to hide.
Even if they do provide you with the log level data, should you trust it? No. The vendor has both the motive and the means to edit, omit, or falsify the data, whether to hide fraud or simply to make themselves look good (e.g. by showing that ads ran on mainstream sites when they actually did not). So what should you do about this?
Simple. You need a different source of data to use to corroborate the data provided to you in the log files. Let's keep this simple, since most marketers are not also data scientists. Let's start at the high level with triangulating quantities, within the same time range. There's no need to do impression level matching if the quantities don't even match up somewhat closely.
In the example below, we take the quantity of bids won from the data provided by the DSP. This is the number of ads the advertiser bought -- 814,433. Then we compare it to the number of ads served from the data provided by the ad server -- 548,509. This quantity should be the same as the number of bids won, since the ad buyer has the right to serve an ad for every bid they win. As you can see, the difference is a 33% drop off: the number of ads served was 33% less than what the advertiser paid for (the number of bids won).
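The arithmetic behind that 33% figure is simple enough to check by hand or in a few lines of Python. The sketch below is only an illustration using the two quantities quoted above; the function name `drop_off_pct` is made up for this example and is not part of any DSP or ad server tool.

```python
def drop_off_pct(bids_won: int, ads_served: int) -> float:
    """Percentage of won bids that never show up as served ads."""
    if bids_won <= 0:
        raise ValueError("bids_won must be positive")
    return (bids_won - ads_served) / bids_won * 100


# Quantities quoted in the example, both for the same time range.
bids_won = 814_433    # from the DSP report
ads_served = 548_509  # from the ad server report

missing = bids_won - ads_served
pct = drop_off_pct(bids_won, ads_served)
print(f"{missing:,} ads paid for but not served ({pct:.0f}% drop off)")
# prints: 265,924 ads paid for but not served (33% drop off)
```

The same two-number comparison works in a spreadsheet; the point is only that both quantities must cover the same time range before you compare them.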
Pro tip: Don't believe the media agency claims that you only pay for ads served. No DSP on earth will let you get away with winning the bid and not paying for it. So the agency is lying to you if they insist you are only paying when the ad is served.
In the next example below, you can see that the number of ads served (931k) is only 7% less than the number of bids won (1 million). That's only a single-digit discrepancy, so we will ignore it for now, since ad tech platforms rarely reconcile to anything better than single-digit discrepancies. The larger drop off comes from looking at the number of ads that actually rendered on screen, compared to the number of ads that were served. Even though an ad server can send out an ad, the ad may never make it to the device, let alone get rendered on screen. For example, on mobile, where wireless bandwidth is lower, the ad may not arrive at the device and render on screen before the user leaves the page or scrolls past the spot where the ad was supposed to go. Next time you are scrolling through Bored Panda, notice the ads and the fact that many of them don't have time to load before you move on.
In this example, data from FouAnalytics shows only 745,514 ads rendered on screen. FouAnalytics' measurement tag only fires after the ad finishes loading, so it is a good proxy for ads rendered on screen. Note that this is not the same as viewability. FouAnalytics does measure viewability too, but that is a different article -- Checking Viewability Measurements with FouAnalytics. Even though an ad finishes loading, it may still not be viewable if the user scrolled past it or left the page. That is what viewability measurements are for. So to wrap up this example, there's a 25% drop off between what the advertiser paid for (1 million bids won) and the number of ads that actually arrived on users' devices and had a chance to be rendered on screen.
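To see where in the chain the ads go missing, it helps to compute the drop off at each stage, both against the previous stage and against the original number of bids won. Here is a rough sketch of that, assuming the rounded figures quoted above ("1 million" and "931k") are close enough to the real counts; the function and variable names are invented for illustration.

```python
# Stage counts from the example. The first two are the rounded figures
# quoted in the text ("1 million" and "931k"); the third is the exact count.
funnel = [
    ("bids won (DSP)", 1_000_000),
    ("ads served (ad server)", 931_000),
    ("ads rendered (FouAnalytics)", 745_514),
]


def funnel_drop_offs(stages):
    """Return (stage name, % lost vs previous stage, % lost vs first stage)."""
    first = stages[0][1]
    rows = []
    for (_, prev), (name, count) in zip(stages, stages[1:]):
        step = (prev - count) / prev * 100
        total = (first - count) / first * 100
        rows.append((name, step, total))
    return rows


rows = funnel_drop_offs(funnel)
for name, step, total in rows:
    print(f"{name}: {step:.0f}% below previous stage, {total:.0f}% below bids won")
# prints:
# ads served (ad server): 7% below previous stage, 7% below bids won
# ads rendered (FouAnalytics): 20% below previous stage, 25% below bids won
```

Splitting the numbers this way shows that most of the 25% loss happens between serving and rendering, not between winning the bid and serving.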
As you can see, you can already find and diagnose problems at the macro level by triangulating data from different data sources -- e.g. comparing DSP data, to ad server data, to FouAnalytics data. If there are discrepancies at this level already, there is no need to do large-scale data processing like impression-level matching. You need to solve these problems first. If, and only if, you see these data sets roughly match up -- e.g. the quantities are pretty close, within single-digit discrepancies -- then it may be worthwhile to do the exercise that PwC did for the ISBA twice: use log level data to try to match ad impressions end-to-end through the programmatic supply chain. More details on the 2022 study compared to the 2020 study here -- ISBA 2022 vs 2020 Transparency Studies - What it actually shows. The key takeaway was that there were improvements, but still only 4% of the impressions could be matched end-to-end using the log level data (61 million / 1.3 billion), up from 2% (31 million / 1.3 billion) in 2020.
Let me peel the onion back one more level. If you ask for domain level reporting -- reports that show you what domain or app your ads went to -- you can also simply triangulate quantities to see if there's anything wrong. In the slide above, by comparing the number of bids won versus the number of ads served per domain, you can easily find the most fraudulent ones. Note on the left side, mainstream sites like weather.com and Spotify have low, single-digit discrepancies. As you read down the column and get to the bottom you will literally see a 100% discrepancy. In the last row, 127k bids were won, but no ads were served. The rule of thumb is that the greater the discrepancy ("drop off"), the greater the risk of fraud. On the right side of the slide above you see a campaign targeting low CPM prices. The discrepancies are much larger and the overall discrepancy was 74%. Basically 3 in 4 ads were not served at all, even though the advertiser paid for them. Hopefully these examples show you that there's much you can do without asking for log level data or doing any kind of big data processing.
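The per-domain comparison can be sketched the same way. Note that the rows below are hypothetical placeholders, not the figures from the slide; only the last row (127k bids won, zero served) mirrors a number quoted above. The 10% review threshold is also an assumption, chosen to match the "single-digit discrepancies are tolerable" rule of thumb.

```python
# HYPOTHETICAL rows for illustration only; the slide's per-domain figures
# are not reproduced here. Only the last row (127k bids won, 0 served)
# mirrors a number quoted in the text.
domain_report = [
    # (domain, bids_won, ads_served)
    ("mainstream-site.example", 50_000, 47_500),
    ("midtier-site.example", 20_000, 12_000),
    ("suspicious-site.example", 127_000, 0),
]

THRESHOLD_PCT = 10.0  # assumption: anything beyond single digits gets a look


def rank_by_discrepancy(rows):
    """Sort domains by share of won bids that were never served, worst first."""
    ranked = []
    for domain, won, served in rows:
        if won == 0:
            continue  # nothing was bought here, so there is nothing to compare
        pct = (won - served) / won * 100
        ranked.append((domain, won, served, pct))
    return sorted(ranked, key=lambda r: r[3], reverse=True)


ranked = rank_by_discrepancy(domain_report)
flagged = [r[0] for r in ranked if r[3] >= THRESHOLD_PCT]
for domain, won, served, pct in ranked:
    flag = "REVIEW" if pct >= THRESHOLD_PCT else "ok"
    print(f"{domain:28} won={won:>8,} served={served:>8,} drop off={pct:5.1f}% {flag}")
```

Sorting worst-first puts the domains with the largest drop off at the top of the list, which is where the harder questions to the vendor should start.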
Ask for log level data to find out whether your ad tech vendor is up to snuff, or is trying to hide something from you. Even when you have the log level data, don't blindly trust it. Look for other sources of data that can corroborate it, or refute it. You can also triangulate between data sets to look for discrepancies. Start at the high level and see if even the macro quantities match up. If they do, then you can delve deeper into the details to troubleshoot further. You don't have to be a data scientist yourself; you don't have to wait for your data science team to do this. As a digital marketer, you can already solve a lot of the problems by asking for the right data and looking more closely at the data yourself, with common sense and a skeptical eye. If something looks funny or suspicious, ask harder questions.
Let me know if I can answer any questions or help further.
Media Director | Digital Media | Ad Fraud Fighter
1 yr ago: I know a certain celebrity-sponsored platform that refuses to show logs.
Co-Founder @Synchronicity.co, Inc. & BOS
1 yr ago: Regular people (well OK, above-regular peeps like me) don't have to understand everything to know there's something fishy about the digital ads biz. I'm just saying. 1. It promises so much, it's very complex and has to be understood by "rocket scientists" such as Mr. Fou, and the data can be adjusted with a few clicks (I'm guessing). 2. I myself highly suspect digital data is increasingly becoming more, let's say, manipulable. And it's not to the sponsors' and/or especially to the consumer's advantage. Hey! I've seen product prices jump in my cart if left in overnight. That is total algo BS. I do click on ads if I'm interested. But I also blow by ads quickly because I know that if I scroll by slowly, like 3 seconds?... it counts against the sponsors. 3 seconds is too short for a video ad, IMO. And how can anyone tell where my eyeballs are for static ads? Maybe those are PPIs or per click, idk.