Wow! All the useless traffic coming to my website
This is a map of a year's worth of traffic to my vanity page

When I was looking for a new role after a decade at Google, I updated my resume and created a vanity page (patrickcopeland.org). I now have a year's worth of Google Analytics data, which I looked at today. Over 10,000 unique visitors! I should be happy, but the "noise level" of the traffic is really high. It made me wonder: who's driving this traffic, and why?

Oh, the (lack of) humanity!

In 12 months, there have been visits from 105 countries (of 196 worldwide) and 1,545 cities. My guess is that any session lasting less than 2 seconds is a crawler or bot of some kind. I was slightly amazed that my webpage was seeing over 67% of its traffic from automated sources. The largest amounts, in descending order, have come from Brazil, Malaysia, Austria, Argentina, Czech Republic, Peru, and Romania, with the vast majority coming from the cities of São Paulo and Rio. I'm not sure why Brazil's bots are so aggressive. Below is a map of all cities that hit my page last year.
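The 2-second rule of thumb above can be sketched in a few lines. This is only an illustration of the heuristic, not the analysis I actually ran: the `Session` shape, the function names, and the sample rows are all made up for the example and do not reflect the Google Analytics export format.

```typescript
// Hypothetical session record; field names are illustrative only.
interface Session {
  country: string;
  durationSeconds: number;
}

// Assumption taken from the text: anything under 2 seconds is automated.
const BOT_THRESHOLD_SECONDS = 2;

function isLikelyBot(session: Session): boolean {
  return session.durationSeconds < BOT_THRESHOLD_SECONDS;
}

// Fraction of sessions (0 to 1) that the heuristic flags as automated.
function botShare(sessions: Session[]): number {
  if (sessions.length === 0) return 0;
  return sessions.filter(isLikelyBot).length / sessions.length;
}

// Flagged sessions per country, sorted from most to fewest.
function botsByCountry(sessions: Session[]): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const s of sessions.filter(isLikelyBot)) {
    counts.set(s.country, (counts.get(s.country) ?? 0) + 1);
  }
  return Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
}

// Made-up sample data, for illustration only.
const sampleSessions: Session[] = [
  { country: "Brazil", durationSeconds: 0 },
  { country: "Brazil", durationSeconds: 1 },
  { country: "Malaysia", durationSeconds: 0.5 },
  { country: "United States", durationSeconds: 95 },
];

console.log(botShare(sampleSessions)); // 0.75
console.log(botsByCountry(sampleSessions));
```

A fixed duration cutoff is crude: a real visitor who bounces instantly is counted as a bot, and a patient crawler is counted as a person, so the result is best read as a rough estimate.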

Bad Referral Sites are also an issue, but a small one

About 5% of the traffic comes from spam referral sites. My site saw 227 referral sites, and you can see the top 8 by traffic volume, where numbers 3, 5, 7, and 8 are mostly garbage traffic generators.

On top of bots there's a lot of "drive-by" traffic

A slightly deeper look shows that 365 cities have an average session time of more than 5 seconds. The longest average session time for any country was Kyrgyzstan's, at 12.36 minutes, but that was only one session, so we can't draw too many conclusions. Interestingly, Pakistan had 14 sessions averaging about 6 minutes, Uganda had 8 sessions averaging about 4 minutes, and Russia had 291 sessions averaging 2.5 minutes.

Hmmm, that last one doesn't seem right. 291 visits? Is this a bot looking for exploits, research for a spear phishing attack, or something else? Below is a sorted list of average session time by country.
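To make the ranking less sensitive to one-off visits, it helps to require a minimum number of sessions before a country counts, and to look at total time as well as the average. The sketch below uses only the four figures quoted above (the averages are approximate); the `CountryStat` type, the function names, and the cutoff of 5 sessions are illustrative assumptions, not part of my analytics setup.

```typescript
interface CountryStat {
  country: string;
  sessions: number;
  avgMinutes: number;
}

// Figures quoted in the text; averages are approximate.
const countryStats: CountryStat[] = [
  { country: "Kyrgyzstan", sessions: 1, avgMinutes: 12.36 },
  { country: "Pakistan", sessions: 14, avgMinutes: 6 },
  { country: "Uganda", sessions: 8, avgMinutes: 4 },
  { country: "Russia", sessions: 291, avgMinutes: 2.5 },
];

// Longest average first, skipping countries with fewer than minSessions.
// filter() returns a new array, so the input is not reordered.
function rankByAvgSession(rows: CountryStat[], minSessions: number): CountryStat[] {
  return rows
    .filter((r) => r.sessions >= minSessions)
    .sort((a, b) => b.avgMinutes - a.avgMinutes);
}

// Total time on site for a country: sessions times average duration.
function totalMinutes(row: CountryStat): number {
  return row.sessions * row.avgMinutes;
}

// With a (hypothetical) cutoff of 5 sessions, Kyrgyzstan drops out.
console.log(rankByAvgSession(countryStats, 5).map((r) => r.country));
// [ 'Pakistan', 'Uganda', 'Russia' ]

// Russia's total is what stands out: 291 * 2.5 = 727.5 minutes.
console.log(totalMinutes(countryStats[3])); // 727.5
```

Seen this way, Russia's modest average hides roughly 12 hours of cumulative time on a one-page vanity site, which is why it looks suspicious.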

Another discovery is that the visiting clients are set to 86 different languages (fact: 85% of the world's population is covered by 100 languages). A few of the language strings were obviously set by bots spoofing the client. For instance, one language setting read, "Congratulations to Trump and all americans." Hmmm.

Why no love?

There are a few places from which I've never seen a visitor. In the US, they are North Dakota and Wyoming. Likewise, there has been no traffic from North Korea, Myanmar, Cuba, Greenland, or Madagascar.

In Hopewell, Virginia, I've seen 14 visits but a bounce rate of 92.86%. Almost none of the 14 were interested in the site (that rate works out to 13 of the 14 bouncing), but that might be because they were really looking for Patrick Copeland Elementary School, located there. Their website is intuitively named: https://copeland.hopewell.k12.va.us/.

Identifying the tourists from the rest

That said, I have one big fan in Leatherhead (population 11,316) who spent 10.25 minutes on my site. Thank you, Leatherhead! On a purely per capita basis, my site is also popular in São Tomé and Príncipe, a tiny island nation in the Atlantic.

Overall, 84.28% of traffic is desktop-based, 12.57% mobile, and 3.14% tablet. And my "audience" has some Affinity Categories that sound a lot like me...

Idea #1: Gamification

I set out to do a bit of testing to see if I could differentiate *real* people from scrapers, hackers, and bots. I added a progress counter that tracks the amount of time a user spends on each "card" of the site. When a user mouses over a card, I delay 500ms and then start a timer for how long they dwell on that card. This gives me a rough count of the number of milliseconds they have stayed and looked at real content on the site. It also roughly identifies what someone found interesting. As the user reads more content, I increment the counter for that card and for the page overall. I set a goal of 8 seconds for the page, and I used Google Analytics to create a goal for incremental progress steps of 10% (by calling ga('send', ...)).
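The counter logic described above can be sketched roughly as follows. This is a reconstruction under stated assumptions, not the code running on my site: `DwellTracker` and every other name here is hypothetical, timestamps are passed in explicitly so the logic runs outside a browser, and the 500ms delay is modeled by subtracting it from each hover rather than by starting a real timer. The `report` callback is a stand-in for whatever sends the progress step to analytics, such as a wrapper around ga('send', ...).

```typescript
const HOVER_DELAY_MS = 500; // delay before dwell time starts counting (from the text)
const PAGE_GOAL_MS = 8000;  // page goal of 8 seconds (from the text)
const STEP_PERCENT = 10;    // progress is reported in 10% steps (from the text)

type ProgressReporter = (percent: number) => void;

class DwellTracker {
  private report: ProgressReporter;
  private enteredAt = new Map<string, number>();
  private dwellByCard = new Map<string, number>();
  private lastReportedPercent = 0;

  constructor(report: ProgressReporter) {
    this.report = report;
  }

  // Call when the pointer enters a card (e.g. from a mouseenter handler).
  enter(cardId: string, nowMs: number): void {
    this.enteredAt.set(cardId, nowMs);
  }

  // Call when the pointer leaves. Only time beyond the initial delay counts.
  leave(cardId: string, nowMs: number): void {
    const start = this.enteredAt.get(cardId);
    if (start === undefined) return;
    this.enteredAt.delete(cardId);
    const dwell = Math.max(0, nowMs - start - HOVER_DELAY_MS);
    this.dwellByCard.set(cardId, this.cardDwellMs(cardId) + dwell);
    this.reportNewSteps();
  }

  cardDwellMs(cardId: string): number {
    return this.dwellByCard.get(cardId) ?? 0;
  }

  pageDwellMs(): number {
    let total = 0;
    this.dwellByCard.forEach((ms) => { total += ms; });
    return total;
  }

  // Emit one report per newly crossed 10% step, capped at 100%.
  private reportNewSteps(): void {
    const percent = Math.min(100, Math.floor((this.pageDwellMs() * 100) / PAGE_GOAL_MS));
    const reached = percent - (percent % STEP_PERCENT);
    for (let p = this.lastReportedPercent + STEP_PERCENT; p <= reached; p += STEP_PERCENT) {
      this.report(p);
    }
    this.lastReportedPercent = Math.max(this.lastReportedPercent, reached);
  }
}

// Usage with made-up card names and timestamps (milliseconds).
const reportedSteps: number[] = [];
const tracker = new DwellTracker((percent) => { reportedSteps.push(percent); });

tracker.enter("resume", 0);
tracker.leave("resume", 2100);     // 1600ms counted, page at 20%
tracker.enter("projects", 3000);
tracker.leave("projects", 3300);   // under 500ms, nothing counted
tracker.enter("projects", 4000);
tracker.leave("projects", 11000);  // 6500ms counted, page goal reached

console.log(reportedSteps); // [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
```

Ignoring the first 500ms of each hover means a pointer that merely sweeps across a card adds nothing, which is the point of the delay: quick pass-overs should not look like reading.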

Interestingly, this feedback graphic increased the average time on the site from 10 seconds to 120 seconds! The 90th percentile went up by 50x. Apparently, people are motivated to achieve 100% and get rewards. As a "prize", I reveal my email and a math equation that I will leave as an exercise for the reader to figure out :-).

Idea #2: Give them something to hear

The next idea I had was to add audio to differentiate real from automated traffic. Bots don't usually listen, so if time on the cards goes up when audio is played, maybe it's a real person. I added a "story" button that uses Apple's text-to-speech voices to tell a story related to each card. This increased time on the site by another 50% for non-bot traffic.

Summary

The internet remains a wild and hairy place, but with a few tricks it's possible to increase the value of a page to real people and to identify useful traffic, killing two birds with one stone. On the positive side, over the last year 33% of the visitors returned and spent on average 5.5 minutes on the site. On the negative side, who's driving the other 67% of traffic, and why? My site is like a radio telescope, a sensor listening to the chatter of the universe. From my couple of hours of analysis of the logs, the noise level of the traffic is high, and most of it (by volume) is useless.

Khaled Saber

Zero Trust starts at Identities | Tech nerd that found Sales | Ex-NASA | Rotarian | Antique Furniture Lover

7 years ago

Very interesting stuff. Thanks for documenting this for the rest of us
