The Bots Continue to Steal Christmas: The Sisyphean Task of "Dismantling" Bot Operations
The Matrix (Warner Bros.) and Narcos (Netflix)

This is Method Media Intelligence, Inc.'s response to the recent disclosure by the US Justice Department with research from Google/WhiteOps. This is long, but I guarantee it’s worth your time if the news story was interesting to you.


While we obviously fully support and work full-time on identifying and ending ad-fraud operations, there is a big disconnect between the effectiveness of identifying one operation at a time and systematically preventing ad-budgets from being spent on this type of web traffic in the first place. It’s always positive news to disrupt a current ad-fraud operation and prevent their monetization, but making an announcement as if this has reduced the scale of the problem is disingenuous.


The reasons for this are the following facts:

  1. The same amount of money will continue to be spent in digital advertising.
  2. Eliminating one fraud operation makes room for the dozens of others to generate that revenue by compensating for eliminated supply.
  3. This identifying and “dismantling” of an operation is not a systematic process that can be done at scale.


It seems that routinely, particularly towards the end of the calendar year, there is a new disclosure of a "bot operation." The story is often one of mystery and masterful technical deception. The image put in our minds is usually of a hoodie-wearing hacker working in front of a screen with green text flowing across it; something out of The Matrix mixed with Narcos.

Here’s our problem with this: we do not think arresting/prosecuting these people is ideal.

I know lots of people who run ad-networks that monetize fake traffic and fake ad-impressions. They’re not criminal masterminds. They are employees of companies that are set up to do these projects and they receive a salary. I personally know, because I used to be one of them from 2012-2014. Stopping the business is more effective than stopping one company. Something like “don’t hate the player, hate the game.”

Similar to the Narcos storyline, here we are arresting some of Pablo Escobar's foot soldiers, while no action is taken against the corrupt law enforcement of major countries (organizations) that facilitates the supply chain to begin with. The difference is that there is no single Pablo Escobar or Medellin Cartel in the story of ad-fraud; there are thousands of them all over the world, hundreds in North America alone, and their hands are clean as far as the DOJ/Google/WhiteOps are concerned.

If this becomes precedent, here’s the slippery slope we go down:

  • What if they’re Americans and not Russians?
  • Will we arrest an unscrupulous or oblivious adops person at Disney/ESPN, Conde Nast, Hearst, Vice Media, Turner, etc. for doing this too?
  • What about a media buyer at an ad-agency who buys millions of dollars of counterfeit ad-space for the agency’s clients?
  • At what point does the liability go away?

(If you have answers to the following questions, please let me know).

The basis of my contention with these arrests is that there is no standard definition of what is a bot. Would this be the same for a coordinated site network that funnelled human redirect traffic from porn sites? Do ad-technology companies know the difference?

Counterfeit currency can be objectively proven not to have been produced at the US Mint. Neither Google nor WhiteOps can do that with all web traffic. We can only prove certain things, and we should stick to those; more on that later in this piece.

What we want advertisers and agencies to truly understand is:

  • You cannot wait for this problem to be solved by law enforcement.
  • The tools to protect yourself from a “3ve”-like operation already exist!

Let’s start with the breakdown of the various aspects of the operation described in the whitepaper:

"3ve.1: Data center based bot with botnet proxies and hijacked IP's"

Easily detected in 5 milliseconds with a device based check. Let’s check if this device can be operated by a human or not.

"3ve.2: Botnet based counterfeit ad fraud"

Hidden frames and containers are easily checked for using Intersection Observer: https://www.w3.org/TR/intersection-observer/

"3ve.3: Data center bot"

Easily detected in 5 milliseconds with a device based check. Let’s check if this device can be operated by a human or not.

Unfortunately for our fantasy desiring selves, we LOVE a juicy story. Doesn’t have to be factual, just has to sound real enough based on our understanding from the outside. This aspect of our psychology goes far beyond any adtech related news. Hackers, bot-nets, take-downs, industry cooperation, etc. are all terms that pique our interest.

Now let’s get into a deep dive of why this isn’t as crazy as the US DOJ, Google, WhiteOps, and press outlets are attempting to make it seem. Again, I welcome people to refute my points. That’s how we’ll eventually have a real discussion about these things.

"One way to bring down bot operations is to blacklist all of their known IP addresses. However, because of the operation's aggressiveness, as well as its ability to rapidly acquire new IP addresses, we realized that a blacklist would only temporarily interrupt 3ve's activity. To take it down permanently, we needed to understand how 3ve was structured and organized, we had to ensure that the operators thought they were going unnoticed in order to observe them and apply our learnings to future security efforts, and we needed to expand our effort beyond Google and White Ops."

This is ignorant at best, and irresponsible at worst. Blacklisting IPs that are known to be residential is a negligent practice. Non-data center IPs can be residential or company IPs, or mobile cell tower IP addresses, so blacklisting them commonly prevents ad delivery to legitimate users. This is exactly why the shift to device-based checks is essential: it judges each impression on its own, whether it arrives through a data center proxy carrying both simulated and legitimate devices (like this one from Google: https://developer.chrome.com/multidevice/data-compression) or through a residential IP address.

"While many of these IP addresses were acquired via a malware called Miuref and Boaxxe, others were obtained using a procedure called Border Gateway Protocol (BGP) hijacking. The hackers essentially seized huge swaths of corporate and residential IP space by interfering directly with the main Internet routing protocol."

Again, this is an overly complex analysis for something that can be detected by checking one property of the device: its graphics and rendering capability. We have been doing this for two years.
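As a rough illustration of what a graphics capability check could look like, here is a minimal server-side sketch. It assumes a client-side tag has already collected a renderer string and a texture limit from the browser. The field names (`webgl_renderer`, `max_texture_size`) and the list of software-renderer markers are assumptions made for this example, not the check we actually run.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: substrings that usually indicate a software (CPU-only) renderer,
# which is common for headless browsers on servers without a GPU.
SOFTWARE_RENDERER_MARKERS = ("swiftshader", "llvmpipe", "software rasterizer")


@dataclass
class DeviceReport:
    """Hypothetical payload sent back by a client-side tag."""
    webgl_renderer: Optional[str]    # None if no graphics context was exposed
    max_texture_size: Optional[int]  # None or 0 if the limit could not be read


def classify_device(report: DeviceReport) -> str:
    """Return a coarse verdict based only on reported graphics capability."""
    if report.webgl_renderer is None:
        return "suspect: no graphics context reported"
    renderer = report.webgl_renderer.lower()
    if any(marker in renderer for marker in SOFTWARE_RENDERER_MARKERS):
        return "suspect: software renderer"
    if not report.max_texture_size:
        return "suspect: implausible rendering limits"
    return "plausible human-operated device"


print(classify_device(DeviceReport("Google SwiftShader", 8192)))
print(classify_device(DeviceReport("ANGLE (NVIDIA GeForce GTX 1060)", 16384)))
```

The verdicts are deliberately worded as "suspect" rather than "bot": some real users disable hardware acceleration, so a check like this is best treated as one objective signal per impression rather than a final ruling.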

"Like 3ve.1, 3ve.3's bots were based in a few data centers, but it used the IP addresses of other data centers instead of residential computers to cover its tracks. Again, data centers are far more suspicious to advertisers worried about bot traffic, but 3ve's strategy still allowed its operators a good degree of agility by allowing them to find new data centers as soon as old data centers were blocked."

This is shockingly primitive. 'Blocking a data center' doesn't really mean anything as a phrase. Blocking a data center IP, again, is too blunt a tool and prevents ads from being delivered to legitimate people, because legitimate traffic is also routed through data center proxies: people use VPNs from data center providers, and Google's proxy is one example (https://developer.chrome.com/multidevice/data-compression). A device-based check would easily catch this from the start.

Blocking IPs is a primitive practice that results in legitimate users being blocked, while bot traffic operators can switch to a different IP the moment they see a drop in monetization.
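The trade-off can be made concrete with a toy comparison. The sketch below uses made-up impressions on reserved documentation IP ranges, and a hypothetical `device_ok` flag standing in for the result of a device-based check. It counts who gets wrongly blocked by an IP range block and who slips past it.

```python
import ipaddress

# Assumption: a blocked data center range (203.0.113.0/24 is reserved for documentation).
BLOCKED_RANGE = ipaddress.ip_network("203.0.113.0/24")

# Made-up impressions; "device_ok" is a hypothetical per-impression device verdict.
impressions = [
    {"ip": "203.0.113.10", "device_ok": False},  # simulated device behind the proxy
    {"ip": "203.0.113.11", "device_ok": True},   # real person using a VPN or proxy
    {"ip": "198.51.100.7", "device_ok": False},  # same bot operator after switching IPs
]


def blocked_by_ip(impression: dict) -> bool:
    """True if the impression's IP falls inside the blocked range."""
    return ipaddress.ip_address(impression["ip"]) in BLOCKED_RANGE


# Legitimate devices that the IP block throws away.
false_positives = [i["ip"] for i in impressions if blocked_by_ip(i) and i["device_ok"]]
# Simulated devices that the IP block lets through.
missed = [i["ip"] for i in impressions if not blocked_by_ip(i) and not i["device_ok"]]

print("legitimate users blocked:", false_positives)
print("bots missed:", missed)
```

In this toy data the IP block rejects one real user and misses one bot, while the per-impression device verdict gets all three right by construction. The point is only to show the shape of the failure, not to measure it.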

Ads.txt usage has been continually suggested as a way to avoid this type of inventory. This is good and will happen over time, but buyers need time to wean themselves off unauthorized seller inventory. Google will make Ads.txt authorized sellers only the default on its DV360 platform in a few months. This will have a big impact, and it will get better as more and more DSPs make this a requirement.
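For readers who have not looked inside an ads.txt file, the following is a minimal sketch of how a buyer-side system might check whether a seller is authorized. It follows the basic comma-separated record layout (ad system domain, seller account ID, DIRECT or RESELLER, optional certification authority ID). The sample file, the exchange domain, and the account IDs are invented for illustration, and this is not how DV360 or any particular DSP implements the check.

```python
from typing import List, NamedTuple


class SellerRecord(NamedTuple):
    ad_system: str     # domain of the advertising system, e.g. an exchange
    account_id: str    # the publisher's account ID within that system
    relationship: str  # "DIRECT" or "RESELLER"


def parse_ads_txt(text: str) -> List[SellerRecord]:
    """Parse data records from ads.txt content, skipping comments and variables."""
    records = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 3:
            continue  # variable lines such as contact=..., or malformed records
        relationship = fields[2].upper()
        if relationship not in ("DIRECT", "RESELLER"):
            continue
        records.append(SellerRecord(fields[0].lower(), fields[1], relationship))
    return records


def is_authorized(records: List[SellerRecord], ad_system: str, account_id: str) -> bool:
    """True if the (ad system, account ID) pair is listed by the publisher."""
    return any(
        r.ad_system == ad_system.lower() and r.account_id == account_id
        for r in records
    )


# Hypothetical ads.txt content for an imaginary publisher.
SAMPLE = """
# ads.txt for example.com (made up)
google.com, pub-0000000000000000, DIRECT, f08c47fec0942fa0
exampleexchange.com, 12345, RESELLER  # inline comment
contact=adops@example.com
"""

records = parse_ads_txt(SAMPLE)
print(is_authorized(records, "exampleexchange.com", "12345"))  # listed seller
print(is_authorized(records, "exampleexchange.com", "99999"))  # unlisted seller
```

A bid request that claims to sell example.com inventory through a pair not in this list is unauthorized inventory, which is exactly what an "authorized sellers only" default filters out.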

Back to this IMPORTANT point: bots vs. humans, how do you know the difference? Counterfeit currency can be objectively proven counterfeit, even if the initial techniques for spotting suspicious notes are not scientific. There is no behavioral analysis that objectively proves a user is a bot. The ONLY objective check we can do is on the hardware. This article by Dr. David Eagleman about trying to help the EU prevent counterfeiting is very helpful in understanding the need for standardization: https://www.ft.com/content/a4b295ca-fe07-11e6-96f8-3700c5664d30

"...3ve.2 used a custom-built browsing engine installed with the Kovter botnet, which had infected hundreds of thousands of computers (~700,000) through malvertising campaigns..."

This is also a strange assessment. The 3ve operation was the monetizer of the traffic and the operator of the websites. The report implies that the botnet and the 3ve operators are connected, but the 3ve operators seem more likely to have been customers of the traffic made available by the operators of the Kovter botnet. Again, these hidden browsers can be easily detected with Intersection Observer (https://www.w3.org/TR/intersection-observer/), which would immediately tell us whether the ad's pixels were actually rendered on the screen. If the browser was custom and did not have the Intersection Observer property, we would also see that the browser is not normal and refuses to report a basic metric. This would be flagged immediately as well.
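The two outcomes described above (a hidden ad, or a browser that cannot report visibility at all) can be sketched as a server-side check on whatever a client-side tag reports. The report keys (`has_intersection_observer`, `intersection_ratio`, `width`, `height`) and the 50% threshold are assumptions for this example; only the idea of reading an intersection ratio comes from the Intersection Observer API itself.

```python
from typing import Any, Mapping

# Assumption: treat an ad as viewable when at least half of it intersects the viewport.
VIEWABLE_RATIO = 0.5


def assess_viewability(report: Mapping[str, Any]) -> str:
    """Classify a hypothetical visibility report sent by a client-side tag."""
    if not report.get("has_intersection_observer", False):
        # A custom browsing engine that lacks the API fails to report a basic metric.
        return "flag: browser did not expose IntersectionObserver"
    ratio = report.get("intersection_ratio")
    if ratio is None:
        return "flag: no visibility measurement reported"
    if report.get("width", 0) <= 1 or report.get("height", 0) <= 1:
        # Hidden frames and containers are often collapsed to a pixel or less.
        return "flag: ad container is effectively zero-sized"
    if ratio < VIEWABLE_RATIO:
        return "not viewable: less than half of the ad was on screen"
    return "viewable"


print(assess_viewability({"has_intersection_observer": False}))
print(assess_viewability({"has_intersection_observer": True,
                          "intersection_ratio": 0.0, "width": 300, "height": 250}))
print(assess_viewability({"has_intersection_observer": True,
                          "intersection_ratio": 1.0, "width": 300, "height": 250}))
```

Because these values are reported by the client, a check like this is strongest when combined with the device-level signals discussed earlier rather than used alone.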

The mitigation of ad-fraud does not have to be an overly complex process. If we stick to objective categories of non-human attributes, like the device rather than keystrokes or mouse movements, we will be far more successful. Forcing objective standards on the perpetrators of fraud will give us the advantage in the cat-and-mouse game, rather than leaving us to play catch-up as they continually reverse engineer the behavioral checks and swap IP addresses.

If you have any questions, please feel free to reach out.

The problem with "objective standards" is that if they are publicly known they will be easily gamed. We detect a lot of bots that make themselves known in ways that are very diagnostic but would be trivial to work around. I cringe every time I read an article on LinkedIn or elsewhere that needlessly elaborates on bot-detection methods. Publicizing this stuff in detail really only helps the bot developers.

Bot detection is easy... convincing advertisers to stop buying it? That's quite a bit harder. An important thing to remember is that many ad-fraud schemes involve a lot of innocent bystanders who have their internet connections or browsers hijacked. Agencies contribute to this victimization by buying this traffic to meet quotas. Worse, some agencies are actually party to these scams, selling victim data to DMPs.

Asif Mammadov

Cloud Engineer at Splunk | Entrepreneur | Doer

5 years ago

Alexander Clouter

Director at coreMem Limited

6 years ago

On a technical note for the ad-servers out there, the problem is that a hostile publisher can trivially make IntersectionObserver and other JavaScript functionality return whatever results they wish. Most verification vendors do not check for sensor manipulation and seem unaware that this is even possible. A winning and reliable strategy is to catch inventory in a lie, similar in style to police questioning. For example, with only an image pixel, you can use several mechanisms (learn their limitations!) to detect the remote User-Agent (HTTP User-Agent header, HTTP header ordering, TCP options header, SSL handshake, ...) and flag up mismatches. Some cases are legit (mobile phone networks use accelerators, CGNAT, ...) but others are not. Where there is a discrepancy, check for other lies :) Back in the day at Telemetry we did not classify datacenters as fraud, as this ignored the widespread legit inventory that was out there with this signal. What we did do was blacklist datacenter traffic that also lied about its User-Agent.

Shailin Dhar

Transparent Advertising & Clean Media

6 years ago

Like I said, they don't see themselves as criminals nor do they behave like it. https://www.buzzfeednews.com/article/craigsilverman/who-ran-methbot-3ve-ad-fraud
