If It's Algorithms vs Algorithms, You'll Lose. Here's Why

Like most things AI (“artificial intelligence”) and ML (“machine learning”) that I have observed so far, the algorithms used in “ad tech” to buy and sell trillions of ads per day are rudimentary and black box. That means they don’t work most of the time, and you have no way to audit them to see whether they actually worked. The typical output advertisers get is Excel spreadsheets and, occasionally, dashboards; these show how many billions of ads they bought, how much they spent, and how the ads performed -- typically stated as the number of clicks and the click-through rates. A marketer could be handed spreadsheets that were entirely made up, when no ads were ever run. In a rare glimpse into blatant criminal behavior, Uber’s successful lawsuit against Phunware surfaced an employee email that read, “time to spin some more BS to Uber.”

“Two former Phunware employees had conducted an internal investigation discovering that Phunware had falsely billed Uber for ad clicks they did not deliver. Most of the Uber app installations that Phunware claimed to have delivered were generated by a fraudulent process known as “click flooding.” The ad traffic Phunware brought for Uber came through auto-redirects, which automatically took visitors to an app store, whether the user clicked on the ad or not. Phunware continued its fraudulent practices. As evidenced in discovery, a widescale culture of fraud continued for many months. For example, in an email sent on Oct. 31, 2016, a Phunware employee wrote: “Guys it’s… time to spin some more BS to Uber to keep the lights on.”” See: One Of Uber’s Lawsuits Against Ad Fraud Comes Full Circle—They Won


Fraud detection algos fail 99% of the time to detect fraud?

From my research over the years, it is also clear that the algorithms in the fraud detection tech that advertisers pay millions of dollars for fail to detect most of the fraud. How is this possible when DoubleVerify and Integral Ad Science make around $250 million in annual revenue and are valued at $5 billion and $4 billion, respectively? The explanation is utterly simple: the botmakers’ algorithms are better, and they easily trick the fraud detection algorithms. That is why, year after year, these companies report IVT (“invalid traffic”) numbers in the 1 - 3% range; the latest H1 2021 report shows 0.6% in the U.S. Don’t assume the other 99.4% is valid or good. You should read it as: they FAILED to detect anything wrong with the other 99.4%; they are only catching 0.6% fraud for you. That would be like buying a box of cookies, opening the box, and finding only a few crumbs, not even a whole cookie. I’d be pissed, wouldn’t you?


In my research since 2011, I have seen fraudsters openly mock fraud detection tech companies. For example, in 2013 we saw botmakers faking mouse movements to trick the detection -- the assumption being, if there were mouse movements, the user must be real, right? The fraudsters didn’t just fake the movements; they deliberately drew a satanic symbol with the mouse movements to mock the fraud detection companies. They knew the detection vendors could see it but could do nothing to prevent it until it was too late: the ads had already been served, and buyers would have to try to get their money back for whatever fraud was caught.
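To make the weakness concrete, here is a minimal sketch of the kind of “any mouse movement means human” heuristic described above, and how trivially a bot defeats it by replaying synthetic movements. All function names, event shapes, and coordinates are illustrative assumptions, not any vendor’s actual code.

```python
# Hypothetical sketch of the naive 2013-era heuristic described in the article:
# "if there were mouse movements, the user must be real."

def looks_human_naive(events):
    """Naive check: any mousemove event at all counts as 'human'."""
    return any(e["type"] == "mousemove" for e in events)

def synthetic_mouse_path(n=20):
    # A bot replays a canned straight-line path. Telling this apart from a
    # human would require velocity/curvature statistics, but the naive check
    # above never even looks at the coordinates.
    return [{"type": "mousemove", "x": i * 5, "y": i * 3} for i in range(n)]

human_session = [{"type": "mousemove", "x": 14, "y": 88},
                 {"type": "click", "x": 14, "y": 88}]
bot_session = synthetic_mouse_path()
```

Both sessions pass the check identically, which is the whole point: the signal is cheap to fake, so its presence proves nothing.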

Bad guys’ algorithms also easily faked viewability measurements, a dimension the ad tech industry spent seven years trying to solve. You know how simple and widespread viewability manipulation was when even mainstream sites -- Newsweek, IBTimes -- were caught doing it in 2017. Fraud detection also relies on IP addresses to try to detect fraud -- e.g. if the IP address is from a data center, it is a bot; if the IP address is residential, it is not. But here’s why that fails too. Fraudsters disguise their data center bots’ traffic by bouncing it through “residential proxy” services. This makes the traffic appear to come from residential IP addresses, even though it originated in data centers -- enough to trick the fraud detection into not marking it “invalid.”
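The IP-based rule can be sketched in a few lines. Everything here is an illustrative assumption (toy IP list, documentation-range addresses); the point is only that the classifier sees the proxy’s exit address, never the bot’s origin.

```python
# Hedged sketch of the IP rule the article describes:
# datacenter IP -> bot; anything else -> presumed human.

DATACENTER_PREFIXES = {"203.0.113."}  # toy stand-in for a datacenter IP list

def classify_by_ip(ip):
    """Naive rule: flag only traffic whose IP falls in known datacenter space."""
    prefix = ip.rsplit(".", 1)[0] + "."
    return "invalid" if prefix in DATACENTER_PREFIXES else "valid"

# The bot actually runs in a datacenter...
bot_origin_ip = "203.0.113.7"
# ...but routes through a residential proxy, so the detector only ever sees this:
proxy_exit_ip = "198.51.100.42"
```

Classifying `bot_origin_ip` correctly flags it, but the detector never sees that address once the proxy is in the path; it sees `proxy_exit_ip` and waves the traffic through.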


Pre-bid filtering algos fail 99% of the time; brand safety tech is rudimentary keyword lists

Many advertisers and ad exchanges pay for pre-bid filtering, which is supposed to prevent bids and ads from going to fake sites and bot traffic, at least in theory. Here’s why pre-bid filtering doesn’t work 99% of the time. When an ad slot becomes available and bid requests go out, every parameter in the bid request is self-declared. That means whatever algorithm sends the bid request can lie on every single parameter and declare it to be whatever the buyer wants to buy. A fake site -- fakesite123[.]com -- would lie about the domain and put some other, legitimate domain in that data field; pre-bid filtering algos see a legit domain and don’t stop this form of fraud. Similarly, pre-bid filtering might look for cookies it has seen before and knows to be bots, and prevent bidding. So bots just dump the cookie and get a new one to get around this detection. HUMAN (formerly WhiteOps) claims to look at “15 trillion events per week,” but the vast majority of bid requests are not filtered out; they are let through because fraudsters easily fake the parameters to appear valid.
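A minimal sketch of that failure mode, assuming an OpenRTB-style payload: the field names loosely follow the OpenRTB convention (`site.domain`, `device.ip`) but this is illustrative, not a real exchange integration. The filter can only ever check what the seller chose to declare.

```python
# Sketch: a bid request is self-declared by the seller, so every field --
# including the domain -- is whatever the sender writes into it.

def make_bid_request(declared_domain, declared_ip):
    return {
        "id": "req-001",
        "site": {"domain": declared_domain},   # self-declared, never verified pre-bid
        "device": {"ip": declared_ip},
    }

def prebid_filter(bid_request, blocklist):
    """Naive pre-bid filter: block only if the *declared* domain is blocklisted."""
    return bid_request["site"]["domain"] not in blocklist  # True = let the bid through

blocklist = {"fakesite123.com"}
# An honest declaration gets blocked; the fraudster simply declares a
# legitimate domain instead of the real one and sails through.
honest = make_bid_request("fakesite123.com", "198.51.100.9")
spoofed = make_bid_request("nytimes.com", "198.51.100.9")
```

The blocklist works perfectly against sellers honest enough to declare the fake domain, which is to say, against nobody who matters.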

Finally, brand safety is a hot topic these days, and many advertisers have rushed out to pay for brand safety detection tech. Data over the years has shown that such tech is no more than rudimentary "if-then" statements -- "if the page contains the word 'blood,' block it." This became acutely evident early in the pandemic, when the front pages of nytimes, wsj, and other major news outlets were blocked by brand safety tech because they contained covid-19-related keywords. Further evidence shows that these brand safety vendors fail to prevent ads from going to clearly fake disinformation sites. So advertisers are paying extra for tech that doesn't block ads from funding disinformation, but does block ads from going to legitimate news sites. See: We’ve Known Brand Safety Tech Was Bad—Here’s How Badly It Defunds The News
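The keyword approach fits in a few lines, which is exactly the problem. The keyword list and page texts below are illustrative assumptions; no context, sentiment, or source reputation is consulted.

```python
# Sketch of the rudimentary "if-then" keyword blocking the article describes.

BLOCKED_KEYWORDS = {"blood", "covid-19", "death"}

def brand_safe(page_text):
    """Block the page if it contains any keyword -- no context, no semantics."""
    text = page_text.lower()
    return not any(kw in text for kw in BLOCKED_KEYWORDS)

# A legitimate news front page trips the keyword list...
news_front_page = "Covid-19 cases surge; blood donations urgently needed"
# ...while a disinformation page that avoids the keywords passes clean.
disinfo_page = "Miracle cure doctors don't want you to know about"
```

The news page is blocked and the disinformation page is judged “safe” -- the inverse of what an advertiser actually wants.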


Bad algos actively exploit good algos to make more money

In addition to easily getting around the fraud detection mentioned above, bad guys’ bots actively exploit the algorithms that advertisers use in their programmatic media buys. For example, most campaigns are optimized for performance, but the data signal used to indicate performance is clicks or click-through rates (CTRs). The optimization algorithms are simple “if-then” statements -- "if there are more clicks on these ad exchanges and sites, then allocate more budget to them or bid higher" -- in order to drive up overall performance. So bad guys’ bots simply click at a slightly higher rate than humans do, and the optimization algorithms faithfully shift budget to the fake sites carrying the bot traffic. That’s why I have said in the past that advertisers buying programmatic media are addicted to large quantities of ads, low prices, and high clicks -- all of which are only possible due to fake sites and bot activity.
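The budget-shifting loop can be sketched as a proportional-to-CTR allocator. The numbers are illustrative assumptions (a ~0.1% human CTR versus bots clicking three times as often); the mechanism, not the figures, is the point.

```python
# Sketch of the naive CTR-driven optimizer the article describes: budget flows
# to whichever site shows more clicks, so a bot that clicks more wins.

def allocate_budget(total_budget, site_stats):
    """Split budget proportionally to each site's CTR (clicks / impressions)."""
    ctrs = {site: s["clicks"] / s["impressions"] for site, s in site_stats.items()}
    ctr_sum = sum(ctrs.values())
    return {site: total_budget * ctr / ctr_sum for site, ctr in ctrs.items()}

stats = {
    "legit-news.com":  {"impressions": 100_000, "clicks": 100},  # ~0.1% human CTR
    "fakesite123.com": {"impressions": 100_000, "clicks": 300},  # bots click 3x as often
}
budget = allocate_budget(10_000, stats)
```

With those inputs, roughly three quarters of the budget flows to the fake site, and every “optimization” cycle reinforces the shift -- the optimizer is doing exactly what it was told, on a signal the fraudster controls.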

As we saw above, bots dump cookies and get new ones to avoid detection. Did you know bots also collect cookies to earn higher CPMs via retargeting? Retargeting algos used by the good guys assume that if a user visited a site, that visit represented an expression of interest. So how do bots extract more money from a particular advertiser? Right -- by visiting that advertiser’s site first, and then getting retargeted with higher-CPM ads when they visit cash-out sites. Bots can also pretend to be desirable audience segments by visiting collections of sites. For example, bots visit medical journal sites to make themselves look like physicians, again so they can trick advertisers’ algos into paying higher CPMs to show ads to what they think are physicians but are in fact bots.
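The retargeting assumption reduces to a one-line bid rule, sketched below. CPM values, site names, and the cookie-history representation are all illustrative assumptions.

```python
# Sketch of the retargeting logic the article describes: a prior visit to the
# advertiser's site is read as interest, so the bid goes up.

BASE_CPM = 2.00
RETARGET_CPM = 12.00

def bid_cpm(cookie_history, advertiser_site):
    """Bid higher if this cookie has 'visited' the advertiser's site before."""
    return RETARGET_CPM if advertiser_site in cookie_history else BASE_CPM

# Bot playbook: farm the visit first, then cash out on a fake site, where the
# retargeter now pays the premium CPM for what it believes is a hot prospect.
bot_history = {"advertiser-brand.com", "medical-journal.example"}  # farmed visits
fresh_history = set()
```

The bot turned a free page load into a 6x higher payout per impression; visiting the medical-journal site in its history works the same way for “physician” audience segments.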

Over the years, more and more digital ad budgets have shifted into programmatic channels on the strength of metrics manipulated by bots -- so ad dollars flow away from legitimate publishers with real human audiences to fake sites and apps with 100% bot traffic, even though little of it is detected as bots or invalid by current fraud detection tech. Oh, and let’s not forget that fraudsters can simply write false data into your Google Analytics to make it appear that you got traffic when they never delivered any. They can also use this technique to claim credit for sales that had already occurred -- making it appear that they caused those sales. See: Fraudsters Cheat By Tricking The Reporting To Look Awesome


So What?

A question that might be on your mind is whether FouAnalytics is also tricked by fraudsters’ bots. The answer is yes -- I always assume they are able to successfully trick my platform. That is why I study the data every day, and have done so for the past 10 years, adding new detections when I see new forms of fraud and new fraudulent techniques. Also, FouAnalytics is an analytics platform, not black box fraud detection, so I can show you the underlying data and explain why something is marked fraudulent or not fraudulent. It also gives you enough detail that your own common sense will tell you when something doesn’t look right. In the simple example below, mobile apps were eating up most of the impressions between midnight and 3 a.m., and all the impressions ran out before noon, leaving no ad impressions to be shown during the rest of the day when humans are active and online.
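That common-sense check is easy to express as a sketch over hourly impression counts. The threshold, function name, and hour counts below are illustrative assumptions, not FouAnalytics logic.

```python
# Sketch of the dayparting sanity check described above: if overnight hours
# take the majority of impressions, humans aren't the audience.

def suspicious_daypart(impressions_by_hour, overnight_hours=range(0, 4)):
    """Flag a day where the 0:00-3:59 window takes a majority of impressions."""
    total = sum(impressions_by_hour.values())
    overnight = sum(impressions_by_hour.get(h, 0) for h in overnight_hours)
    return total > 0 and overnight / total > 0.5

# Impressions bunched into 0:00-3:00; budget exhausted by late morning.
day = {0: 30_000, 1: 28_000, 2: 25_000, 3: 9_000, 9: 4_000, 10: 2_500, 11: 1_500}
```

A flat, human-looking distribution across all 24 hours would not trip the flag; the overnight spike does, no black box required.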

[chart: hourly distribution of ad impressions, from FouAnalytics]

The moral of the story is: don’t trust black box algorithms. You have no way to verify whether they are doing what they said they would do. Even if they claim to be MRC accredited or TAG certified, that doesn’t mean they can actually detect fraud correctly. Use analytics (your own, plus FouAnalytics if you want) to monitor campaigns in greater detail. That way you can spot what clearly doesn’t make sense or looks suspicious, dig deeper to investigate it, and optimize away from the problems. Let me know how I can help further.


Here are additional articles on this topic:

https://www.forbes.com/sites/augustinefou/2020/06/15/do-certifications-and-accreditations-help-reduce-ad-fraud/

https://www.forbes.com/sites/augustinefou/2020/07/24/how-to-select-a-fraud-verification-vendor/

https://www.forbes.com/sites/augustinefou/2020/11/23/what-your-fraud-detection-vendor-misses/
