How to Make Money Using Fake Android Apps?

Ad fraud. In this case, app fraud, even though no Android phone or installed app is involved. The fraud is generated entirely on servers running a Python script.

You might think: “Sure, but we have a blacklist, so we’re safe!” Wrong! Your blacklist doesn’t contain every possible permutation of fake app names. Fraudsters continuously generate new names, and sometimes even use random sequences of characters. Maintaining a whitelist of allowed app names would be much safer. But how safe are you really?
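The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration: the app names and both lists below are hypothetical, and a real list would be far larger.

```python
# Hypothetical lists, for illustration only.
BLACKLIST = {"com.fake.flashlight", "com.fake.cleaner"}
WHITELIST = {"com.example.news", "com.example.weather"}


def rejected_by_blacklist(app_name: str) -> bool:
    """A blacklist rejects only names that are already known to be bad."""
    return app_name in BLACKLIST


def accepted_by_whitelist(app_name: str) -> bool:
    """A whitelist accepts only names that were explicitly approved."""
    return app_name in WHITELIST


# A freshly generated random name is unknown to both lists.
for name in ["com.fake.flashlight", "com.xq7zk.app", "com.example.news"]:
    print(name,
          "| blacklist rejects:", rejected_by_blacklist(name),
          "| whitelist accepts:", accepted_by_whitelist(name))
```

The never-seen-before name slips past the blacklist but is still refused by the whitelist. Note that a whitelist only checks the claimed name; it says nothing about whether the claim itself is genuine.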

About two months ago Dr. Augustine Fou asked me the following question:

How difficult is it to generate or modify an HTTP web request to retrieve a webpage?

After some questions back and forth it boiled down to:

  • Can a bot modify or add a referrer to a request? And if so: How?
  • Can a bot modify or add the App name (Android)? And if so: How?
  • Can a bot modify a bid request? And if so: How?
  • Can a bot modify the domain of the website? And if so: How?

To answer these four questions up front: yes, all of this is possible! In this article I will show how you can change HTTP requests in your browser and how the same can be done programmatically.

Dr. Fou’s post “60% of digital ad spending going to mobile apps is a bad thing” states that apps generate “30 trillion bid requests per week” [4]. Full stop! Let’s break that down. First, let’s assume this number is worldwide. In digital marketing the USA represents about one third of the worldwide volume, so roughly 10 trillion bid requests per week. According to Wikipedia, the population of the USA is 335 million. So, 10 trillion bid requests divided by 335 million = 29,851 bid requests per week per person. Isn’t that a bit much? Even if only one in 10 bid requests is a win, that’s still roughly 3,000 ads per week, or about 428 advertisements a day! That is an average over everyone, including babies, elderly people, people at work, etc. Isn’t this number a bit steep?
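The back-of-the-envelope calculation above can be reproduced in a few lines of Python. The inputs are the figures and assumptions from the text (worldwide volume, a one-third US share, a one-in-ten win rate); none of them are measured values.

```python
WORLD_BID_REQUESTS_PER_WEEK = 30_000_000_000_000  # "30 trillion" quoted in [4]
US_SHARE_DIVISOR = 3          # assumption from the text: USA is about 1/3 of volume
US_POPULATION = 335_000_000   # population figure used in the text
WIN_RATE_DIVISOR = 10         # assumption from the text: one in 10 requests wins

us_requests_per_week = WORLD_BID_REQUESTS_PER_WEEK / US_SHARE_DIVISOR
per_person_per_week = us_requests_per_week / US_POPULATION
wins_per_person_per_week = per_person_per_week / WIN_RATE_DIVISOR
wins_per_person_per_day = wins_per_person_per_week / 7

print(f"Bid requests per person per week: {per_person_per_week:,.0f}")
print(f"Won impressions per person per week: {wins_per_person_per_week:,.0f}")
print(f"Won impressions per person per day: {wins_per_person_per_day:,.0f}")
```

Without the intermediate rounding to 3,000, the daily figure comes out at about 426 rather than 428. The conclusion stays the same.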

Maybe you remember Dr. Fou’s post from August 2024, shown in Figure 1 [1]? This “bot attack” consisted of ~10,000 requests to fouanalytics.com claiming to originate from fake, though humorous, Android apps. It wasn’t a real attack. It was an experiment in which I wrote a program that fired HTTPS requests to https://fouanalytics.com while changing the Android app name by spoofing the X-Requested-With HTTP header. And yes, those app names were truly randomly generated.


Figure 1: "Bot attack" on fouanalytics.com showing randomly generated app names. Image taken from Dr. Fou's post [1]

This article explains the basics of how I achieved this, in layman’s terms. It gives you a better understanding of how easy it is to manipulate or generate HTTP requests. Your takeaway after reading this should be to not blindly trust anything online. Especially when money is involved. And even more so when lots of money is involved, as in the advertising ecosystem.

First, I’ll explain what requests look like and how to change HTTP headers in your own browser by intercepting the requests. This shows exactly what happens under the digital hood. The second part does the same programmatically using Python. This takes only a few lines of code and involves no magic. Let’s start with the manual approach, using a browser extension.

Use a browser and an extension, such as Requestly

What happens when a browser navigates to a website? The browser sends an initial request which contains the URL of the website, the path, and optionally a query string and cookies. Using a browser extension, all of this can be modified easily.
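As a small illustration of those parts, Python's standard library can split a URL into its components. The path and query string below are made up for the example.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL: the path and query string are invented for illustration.
url = "https://www.fouanalytics.com/some/path?utm_source=example&id=42"

parts = urlsplit(url)
query = parse_qs(parts.query)

print(parts.scheme)    # https
print(parts.hostname)  # www.fouanalytics.com
print(parts.path)      # /some/path
print(query)           # {'utm_source': ['example'], 'id': ['42']}
```

Cookies are not part of the URL. They travel in a separate Cookie request header.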

Figure 2: The rules to modify the HTTP request headers with the Requestly browser extension

Figure 2 shows that the user agent of a Chrome desktop browser is changed to that of an Android device, more specifically the user agent of the Facebook app. The rules also add the X-Requested-With HTTP header to make it appear that a real app is making the request. The Referer and Origin HTTP headers are set to https://www.facebook.com/, but only if the URL in the location bar contains fouanalytics.com.
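The effect of such rules can be mimicked in plain Python. This is a sketch of the logic only, not Requestly's own rule format. The user agent string is a made-up placeholder rather than the value from Figure 2, and reading the URL condition as applying only to Referer and Origin is an assumption.

```python
# Placeholder value, NOT the exact string used in Figure 2.
SPOOFED_USER_AGENT = "Mozilla/5.0 (Linux; Android 13; wv) ExampleFacebookApp/1.0"
FACEBOOK_PACKAGE = "com.facebook.katana"  # package name of the Facebook Android app
FACEBOOK_URL = "https://www.facebook.com/"


def apply_rules(url: str, headers: dict[str, str]) -> dict[str, str]:
    """Return a modified copy of `headers`, mimicking the rules described above."""
    modified = dict(headers)  # never mutate the caller's dictionary
    modified["User-Agent"] = SPOOFED_USER_AGENT
    modified["X-Requested-With"] = FACEBOOK_PACKAGE
    # Assumption: only Referer and Origin depend on the URL condition.
    if "fouanalytics.com" in url:
        modified["Referer"] = FACEBOOK_URL
        modified["Origin"] = FACEBOOK_URL
    return modified


original = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome"}
for name, value in apply_rules("https://www.fouanalytics.com/", original).items():
    print(f"{name}: {value}")
```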

So, let’s see what traffic to and from https://www.fouanalytics.com looks like when these header modification rules are applied to live traffic. Figure 3 shows the network traffic to the main FouAnalytics site, with the changed HTTP headers highlighted in blue. The HTTP headers starting with Sec-Ch-Ua still have their original values, which makes it obvious that the request has been tampered with.


Figure 3: Screenshot of www.fouanalytics.com, where the HTTP headers highlighted in blue have been modified by Requestly
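The mismatch between the spoofed user agent and the untouched Sec-Ch-Ua headers is something a server could check for. The sketch below compares only the platform hint with the user agent. The function name is hypothetical, and a real detector would look at far more signals than this.

```python
def looks_tampered(headers: dict[str, str]) -> bool:
    """Flag requests whose User-Agent and Sec-CH-UA-Platform disagree about Android.

    A missing platform hint is NOT treated as suspicious here, because not
    every client sends client hints.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    user_agent = lowered.get("user-agent", "")
    platform = lowered.get("sec-ch-ua-platform", "").strip('"')
    if not platform:
        return False
    claims_android = "Android" in user_agent
    return claims_android != (platform == "Android")


spoofed = {
    "User-Agent": "Mozilla/5.0 (Linux; Android 13; wv) ExampleApp/1.0",
    "Sec-CH-UA-Platform": '"Windows"',  # left untouched by the header rules
}
print(looks_tampered(spoofed))  # True
```

A fraudster who also rewrites the Sec-Ch-Ua headers passes this check, which is why header inspection alone is not enough.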

Modifying HTTP headers changes only one piece of the puzzle. It gives control over the HTTP headers, but not over the lower-level networking layers (i.e. network packets, TLS fingerprinting) or over the data captured at the application layer in the browser using JavaScript. Controlling those requires more than just modifying HTTP traffic.

But remember: when you’re advertising on CTV, no JavaScript detection can be run. So, if the only detection mechanisms are the user agent and the reputation of the IP address, you are easily conned.

Requestly is a great tool to explain and show what really happens and how it happens. But it doesn’t scale! Each ad impression generates a few pennies at most. To make enough money for a life of luxury with big houses, swimming pools, lazy rivers, private jets and luxury sports cars, you need a LOT of impressions and clicks. Remember: 30 trillion bid requests per week! That's 30,000,000,000,000!

In a previous post the different types of fraud were described: request based, browser automation and human operated [1]. Request based is the cheapest way to scale, because it has the least overhead. Browser based is far more expensive, as browsers need a lot of memory and CPU. Human-operated click fraud is more expensive still. So, let’s move straight to request based, because that is where it happens.


Use Python to automate HTTP requests

As mentioned before, request-based automation is lightweight in terms of resources (CPU and memory). Everything is command-line based, which means less overhead, so you can run many instances concurrently.

Figure 4 contains the source code of a minimal Python program that fires HTTPS requests to web servers. These requests are generated out of thin air and contain everything needed to look legitimate at the HTTP level. As a reminder: an HTTP request consists of a method (e.g. GET, PUT), headers and a body [3].
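To make that reminder concrete, here is a separate illustration (not the code from Figure 4) that builds a request object with Python's standard library and inspects its parts. Nothing is sent over the network, and the header values are placeholders.

```python
from urllib.request import Request

# Build a request object only; it is never sent in this example.
req = Request(
    url="https://www.fouanalytics.com/",
    method="GET",
    headers={"User-Agent": "example-agent/1.0", "Accept": "text/html"},
    data=None,  # the body; a GET request normally has none
)

print("Method: ", req.get_method())
print("URL:    ", req.full_url)
print("Headers:", req.header_items())
print("Body:   ", req.data)
```

Every one of these parts is just a value chosen by whoever writes the code.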


Figure 4: Python source code. The snippet shows the initialization of the variables and lists used to generate HTTP(S) requests.

The code in Figure 4 shows the loading of the wordlists, i.e. the nouns and adjectives (lines 25-26), the configuration of the HTTP request (lines 28-36), the HTTP headers (lines 49-62), the user agent (line 36), and the empty cookie (line 48).

Figure 5: Python source code. The code snippet shows the generation of fake app names and the generation, firing and logging of the HTTP(S) requests

The code in Figure 5 generates a list of 100 random fake Android app names (lines 74-82). At line 87 a loop starts. This loop picks a random name from the freshly generated list, clears the HTTP headers dictionary, and repopulates it with the headers from Figure 4 (lines 49-62). At line 94 the fake app name is added. At lines 95-96 the cookie is added, but only if it has a value. At line 97 the URL is constructed from its individual parts. At line 98 the request is fired using this URL, the query string parameters, the HTTP headers and a callback function that lets you capture the response. Finally, line 99 prints a log line with the time, the HTTP status, the method, the URL, and the app name used.
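The steps described above can be sketched as follows. This is not a reproduction of the script in the figures: the wordlists, the name format, the header values and the target host are all hypothetical, and the loop is a dry run that builds each request without sending it.

```python
import random
import time
from urllib.parse import urlencode

# Tiny inline wordlists (hypothetical); the original loads them from files.
ADJECTIVES = ["happy", "sleepy", "shiny", "grumpy"]
NOUNS = ["panda", "toaster", "cactus", "rocket"]

HOST = "example.com"   # hypothetical target
PATH = "/"
PARAMS = {}            # query string parameters, empty here
COOKIE = ""            # empty cookie, as in the description above
BASE_HEADERS = {
    "Accept": "text/html",
    "Accept-Language": "en-US,en;q=0.9",
    "User-Agent": "example-android-webview/1.0",  # placeholder value
}


def make_app_names(count: int, rng: random.Random) -> list[str]:
    """Return `count` fake names; the com.<adjective><noun>.app format is assumed."""
    return [f"com.{rng.choice(ADJECTIVES)}{rng.choice(NOUNS)}.app"
            for _ in range(count)]


def build_request(app_name: str, cookie: str = COOKIE) -> tuple[str, dict[str, str]]:
    """Build the URL and a fresh headers dictionary for one request."""
    headers = dict(BASE_HEADERS)            # start from a clean copy each time
    headers["X-Requested-With"] = app_name  # the spoofed Android app name
    if cookie:                              # add the cookie only if it has a value
        headers["Cookie"] = cookie
    url = f"https://{HOST}{PATH}"
    if PARAMS:
        url += "?" + urlencode(PARAMS)
    return url, headers


rng = random.Random(42)  # fixed seed so repeated runs give the same names
app_names = make_app_names(100, rng)

for _ in range(3):
    app_name = rng.choice(app_names)
    url, headers = build_request(app_name)
    # Dry run: nothing is sent, so "DRY" stands in for the HTTP status.
    print(int(time.time()), "DRY", "GET", url, app_name)
```

Keeping the request construction in its own function makes it clear that every field is simply a parameter.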

This tiny Python script, less than 100 lines, is a boiled-down version of the code I used in early August. It shows exactly which steps to take in order to construct and fire requests programmatically. And yes: these requests don’t match the TLS fingerprint of a real Android webview, they don’t use a mix of residential and mobile proxies, and the script loads only the main page of www.FouAnalytics.com. But if you know how to write code like this yourself, you also know how to improve it and make it appear like the real thing.

Figure 6: The output of the Python script. The python command and the script name are marked in orange. Each output line starts with an epoch time, followed by the reported HTTP status, the method, the URL, and the Android app name used

When the Python script is executed three times, it generates the output shown in Figure 6. Line 99 in Figure 5 shows how each output line is generated. The generated app name appears at the end of each line.

To answer the questions from the start of this article: can a bot, in this case a Python script, generate HTTP requests out of thin air, with a freely configurable domain name, referrer, app name, user agent, webview HTTP headers, payload, etc.? Yes, of course it can.

The remaining questions: Can a bot generate a prebid request? Load the advertisement? Fire the completion pixels? By now you should be convinced that this is perfectly possible. It also means that, once perfected and scaled, a tech stack like this can be leveraged to make a lot of money. Guess who's indirectly paying for that? The advertisers? Nope: you! Ad fraud means wasted money, and advertisers pass that cost on to their customers: you. That means less competitive pricing.

Finally, knowing what's technically feasible, the ridiculous number of prebid requests, and who's paying for it, what can reasonably be done about this?

  • Start measuring the quality of your ads, and the quality of visitors arriving at your landing pages [5].
  • Perform analytics to see where, when and what causes low quality [6].

Measuring and detecting ad fraud is the first step. Analytics is the second. The third step is to look at your contracts and see whether you are eligible for refunds or traffic credits in case of fraud. If not, make sure this is included when you renew them. The final, operational step is to manage the traffic quality and mitigate its impact.

If you have any questions, suggestions or specific requests, or would like to improve your digital marketing results, feel free to connect, comment or DM.


#adfraud #bots #CMO #digitalmarketing #browserautomation #python #analytics

[1] https://www.dhirubhai.net/posts/augustinefou_another-bot-attack-on-my-site-last-time-activity-7229601769884954625-UEC6?

[2] https://en.wikipedia.org/wiki/List_of_HTTP_header_fields#Common_non-standard_request_fields

[3] https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages

[4] https://www.dhirubhai.net/pulse/why-60-digital-ad-spending-going-mobile-apps-bad-thing-fou-bwbie/

[5] https://www.dhirubhai.net/posts/kouwenhovensander_adfraud-bots-cmo-activity-7245048677835186178-c6vk

[6] https://www.dhirubhai.net/posts/kouwenhovensander_adfraud-bots-cmo-activity-7247590450545455105-Wa3L

Rodrigo Martinez

Tech Leader | Digital Transformation - AI - Cybersecurity | Growth & Exit Strategist | M&A | Startups | Ex Co-Founder @ hpG and STI Internet (both exited) | CEO - CIO - CTO - CMO | Nexialist | Polymath | Autodidact | SDG

1 month ago

"Welcome, to the real world." - Morpheus.

Michael M. M.

Ad-Fraud Investigator & Media Expert, member of Digital Forensic Research Lab cohort "Digital Sherlocks" - Adding some fun when asking unexpected questions you were not prepared to hear

1 month ago

Great article. I use for research reasons browser extensions that randomly changes user agents (mobil, desktop, iOs, etc.) based on parameters I set. I do this f.e. to see if different bid-requests and ads are coming in, or when f.e. checking prices for travel arrangements, where platforms charge you more, if you are f.e. an iOS user.
