"The (hu)man behind the curtain" doing bot detection, revealed
[Image: screen capture from the Wizard of Oz movie]

Quoting from the Twitter vs. Elon Musk court document, C.A. No. 2022-0613-KSJM [PDF], emphasis mine:

"New facts have come to light that call into doubt the truthfulness of Twitter’s curiously static representation in SEC filings that less than 5% of its accounts are false or spam.

On April 28, just three days after signing the Agreement, Twitter restated three years of its mDAU ("monetized Daily Active User") numbers, despite never disclosing the issue to Defendants pre-signing. Post-signing, Defendants promptly sought to understand Twitter’s process for identifying false or spam accounts. In a May 6 meeting with Twitter executives, Musk was flabbergasted to learn just how meager Twitter’s process was. Human reviewers randomly sampled 100 accounts per day (less than 0.00005% of daily users) and applied unidentified standards to somehow conclude every quarter for nearly three years that fewer than 5% of Twitter users were false or spam. That’s it. No automation, no AI, no machine learning."

Many of the folks I have talked with in recent months have speculated along the lines of "Twitter is such a big company, flush with resources; they've got to have some of the most advanced fraud detection around, especially because they can see all their own data." Based on years of experience studying fraud, I am not so quick to give them the benefit of the doubt. In fact, the best practice is to NOT assume ability, capability, or expertise unless it has been proven repeatedly over time and with supporting data. In most cases where a vendor refuses to show you any data, you can safely assume the opposite, i.e. that it is more likely to be just "smoke and mirrors," as the saying goes. If a vendor does not give you access to the data, let alone a way to independently verify it, you should not trust the numbers they give you. You probably know the other saying, about a vendor "grading their own homework" and telling you "everything's fine, just keep paying." You should not be so trusting.

I tell all my clients: don't take my word for it. Let's look at the data together. If I cannot explain why something was marked fraudulent or not fraudulent in a way you can understand completely, then I am not doing my job and the analytics are not valuable to you.

[Image: tweet activity charts from conspirator0, an automated account (above) and my account (below)]

If we look at some real data from a real expert on automated Twitter accounts -- conspirator0 -- we can see the difference between the chart of an automated account (above, blue circles) and the chart of my account (below, circles of different sizes). Automated accounts tweet at the same velocity and quantity; they tweet duplicate or repetitive content; they retweet others' content to amplify it; and so on. Automated accounts can also generate thousands of times more activity than human-controlled accounts. So even if the number of spam accounts is small (it isn't), the amount of activity on Twitter generated by automated accounts is disproportionately larger than the activity generated by human accounts. Give this a quick thought: how many tweets do you think you can generate per hour, per 5 minutes, per minute? Not too many. But automated accounts can like, retweet, or tweet thousands of times in that same amount of time.
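
To put rough numbers on that point, here is a minimal Python sketch of the arithmetic. The inputs are hypothetical and chosen only for illustration: the share of accounts that are automated, and how many times more activity one automated account generates than one human account. The function name is also made up for this example.

```python
def automated_activity_share(automated_account_share, activity_multiplier):
    """Fraction of all activity that comes from automated accounts.

    automated_account_share: fraction of accounts that are automated (0 to 1).
    activity_multiplier: how many times more activity one automated account
        generates than one human account (hypothetical input).
    """
    # Activity is measured in "units per human account", so a human counts as 1.
    bot_activity = automated_account_share * activity_multiplier
    human_activity = (1.0 - automated_account_share) * 1.0
    return bot_activity / (bot_activity + human_activity)


if __name__ == "__main__":
    # Hypothetical scenarios, not measured values.
    for share in (0.01, 0.05):
        for multiplier in (10, 1000):
            result = automated_activity_share(share, multiplier)
            print(f"{share:.0%} of accounts at {multiplier}x -> {result:.1%} of activity")
```

Under these assumed inputs, 5% of accounts at a 1,000x multiplier would produce roughly 98% of all activity. The exact figures depend entirely on the inputs, but the shape of the result is the point: a small share of accounts can dominate the traffic.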

The following chart from conspirator0 shows tens of thousands of new followers arriving in a short period of time. Thousands of new Twitter accounts can be created en masse too, often with fake profile pictures made from AI-generated faces. Data from conspirator0 also shows that many of these accounts never tweet original content; their purpose is to follow, retweet, and like others' tweets to amplify them and give them velocity. Ever wonder how easily the "trending tweets" are manipulated? Now you know how they do it.

[Image: chart from conspirator0 showing tens of thousands of new followers in a short period of time]
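
The signals described above (a fixed posting cadence, duplicate or repetitive content, and little or no original content) can be turned into rough numbers. The following Python sketch is only an illustration under assumptions: the function name, the input format, and the three measures are hypothetical, and this is not conspirator0's method or any vendor's.

```python
from collections import Counter
from statistics import mean, pstdev


def automation_signals(timestamps, texts, is_original):
    """Return three rough signals for one account.

    timestamps: posting times in seconds, in ascending order.
    texts: the text of each post.
    is_original: one bool per post, False for retweets.
    """
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    # Coefficient of variation of the gaps: near 0 means a fixed cadence.
    if gaps and mean(gaps) > 0:
        gap_variation = pstdev(gaps) / mean(gaps)
    else:
        gap_variation = None
    # Share of posts whose text repeats an earlier post.
    counts = Counter(texts)
    duplicate_share = sum(c - 1 for c in counts.values()) / len(texts) if texts else 0.0
    # Share of posts that are original rather than retweets.
    original_share = sum(is_original) / len(is_original) if is_original else 0.0
    return {
        "gap_variation": gap_variation,
        "duplicate_share": duplicate_share,
        "original_share": original_share,
    }


if __name__ == "__main__":
    # Made-up example data: one post every 10 minutes, same text, all retweets.
    scheduled = automation_signals(
        [0, 600, 1200, 1800, 2400], ["Buy now"] * 5, [False] * 5
    )
    # Made-up example data: irregular gaps, distinct texts, mostly original.
    human = automation_signals(
        [0, 95, 4000, 4100, 30000],
        ["a", "b", "c", "d", "e"],
        [True, True, False, True, True],
    )
    print(scheduled)
    print(human)
```

None of these numbers proves automation on its own. They are starting points for looking at the data, and any thresholds would have to come from real, verified examples.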

To bring this to a conclusion: don't assume that Twitter has great technology or expertise in weeding out fake, spam, and automated accounts on its own platform. Court documents gave a rare glimpse into the reality of the situation -- literally a case of the Wizard of Oz (a small man behind a curtain controlling the special effects). In Twitter's case it was (a few) humans manually reviewing 100 accounts per day and deciding what to label as spam or not.

Similarly, venture-funded, private-equity-owned, and now public fraud detection tech companies may not have any better tech than the above. Don't take my word for it. Run the "fartbot" experiment yourself: rename your browser "fartbot" and see if they can detect it and prevent the ads from loading. Not only do they fail to detect the most obvious bots; they also detect bots where there are none. These vendors fail to meet even the minimum requirements for MRC accreditation for the most basic bot detection. Furthermore, most bots are able to strip out the vendors' detection tag, so between 50% and 99% of the impressions are simply not measured. That's why they keep reporting 1% IVT (invalid traffic) -- hint: that's all they could detect. You already know this, since I've said it many times over the years. Why are you still paying for fraud detection when you can upgrade your processes (e.g. get detailed placement reports) to see more of the fraud yourself, and upgrade your tools (e.g. FouAnalytics) to do more than you could with fraud detection?
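
For a sense of how low the bar in the "fartbot" experiment is, here is a minimal Python sketch of the most basic check it probes: looking at the declared user agent. It assumes the renamed browser sends "fartbot" as its user-agent string. The function name and the pattern are hypothetical, for illustration only, and do not represent any vendor's actual method.

```python
import re

# Hypothetical pattern: a user agent that openly names itself as a bot,
# crawler, or spider. This is an illustrative rule, not a vendor's list.
DECLARED_BOT_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)


def is_declared_bot(user_agent):
    """Return True when the user-agent string openly declares a bot."""
    return bool(DECLARED_BOT_PATTERN.search(user_agent or ""))


if __name__ == "__main__":
    samples = [
        "fartbot",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    ]
    for user_agent in samples:
        print(is_declared_bot(user_agent), user_agent)
```

A check like this catches only bots that announce themselves. A bot that sends an ordinary browser user agent passes it, which is why it counts as the baseline and not as real detection.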

Look through the following list of sites from a live campaign. How many mainstream publisher sites do you recognize?

[Image: list of sites from a live campaign]

Also, have a look at the list of mobile apps chomping on ad impressions in a programmatic campaign. Note that PMPs and deal IDs don't prevent these fake publishers and apps from appearing in your campaigns. Note also that all of the sites above and the apps below got past multiple layers of fraud detection: a large DSP uses (blank vendor) across everything, and agencies and advertisers use (blank vendor) on all their campaigns. All of these got through and were not marked as invalid.

[Image: list of mobile apps from a programmatic campaign]

Why are you still buying programmatic ads? Why are you still paying for fraud detection tech services?

Michael M. M.

Ad-Fraud Investigator & Media Expert, member of Digital Forensic Research Lab cohort "Digital Sherlocks" - Adding some fun when asking unexpected questions you were not prepared to hear

2 years ago

Unfortunately, I know many of the listed sites… And these are not the good guys.

"Never expect a man to understand something when his job depends on not understanding it..." - the real issue is NOT the % of bot accounts... it's the % of TRAFFIC, stupid! (Not calling Augustine Fou stupid, just re-quoting Jim Carville!) - great to see this all coming out... well done to conspirator0!

Matt Quist

Energy | External Relations

2 years ago

Excellent rundown. Twitter is in for some interesting times in court I think.

Debbie Reynolds

The Data Diva | Data Privacy & Emerging Technologies Advisor | Technologist | Keynote Speaker | Helping Companies Make Data Privacy and Business Advantage | Advisor | Futurist | #1 Data Privacy Podcast Host | Polymath

2 years ago

Augustine Fou, Twitter and/or Musk need you.
