How to Check Which URLs Google Is Crawling More?

Now that Google has launched the Crawl Stats report in Google Search Console, we finally have a way to see which pages of our site are more popular according to Google.

Google has already said on record that they crawl popular URLs more frequently to keep them fresh in their index.

See for yourself

[Image: Google's statement that popular URLs are crawled more frequently]

Now the good news is that there is a way to find what URLs from your site are more popular according to Google.

There are instances when your site is hosted on a cheaper hosting plan and, as a result, you don't get access to your own log files to analyze in log file analysis tools like Screaming Frog Log File Analyzer or SEMrush.

If you suffer from this predicament, then Google's Crawl Stats report is an option for you. It's not as comprehensive and flexible as log file analysis, but hey, at least you are getting the last 3 months' data, right?

If Google is crawling certain URLs more often, we can assume those pages are more authoritative on your website. That means you can use this data for internal linking to pass PageRank, and you can dig into why Google considers those URLs more popular, find commonalities, and replicate that blueprint again and again.

Now comes the part you were waiting for: HOW?

It takes more than just Google Search Console to do this the smart way

What will you need?

  1. Google Search Console
  2. Excel

That's it, you just need these two things.

Step 1: Go to Crawl Stats Report and Export the Excel

[Image: the Crawl Stats report in Google Search Console]

You have to go to the Crawl Stats report section and click the host that you want to analyze, whether it is www.example.com or example.com. Naturally, you would want to pick the host that has received more crawl requests so that you have more data at hand.

For example, I am analyzing www.decodedigitalmarket.com because this is the permanent live version of my site, which explains why this host has received 14K crawl requests in the last 3 months.

Now export the Excel file from the OK (200) crawl requests to analyze Googlebot behavior on pages that are live and return a 200 status code.

Step 2: Keep only landing page URLs and remove the rest

What's the rest?

That means plugin files, feed files, CDN files, and more; these will only get in the way of your analysis, so remove them in Excel using the filtering options.

[Image: Excel filters removing plugin, feed, and CDN URLs]

After removing all the unnecessary pages, 382 pages remain.
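If your export is large, filtering in Excel can get tedious. Here is a minimal pandas sketch of the same cleanup, assuming the export is saved as crawl_requests_200.xlsx with a URL column; the file name, column name, and noise patterns are assumptions, so adjust them to match your own export and site.

```python
import re

import pandas as pd

# Hypothetical file name; use whatever you named the OK (200) export.
df = pd.read_excel("crawl_requests_200.xlsx")

# Typical non-landing-page requests: plugin assets, feeds, CDN/static files.
# Adjust this list for your own site.
noise = ["/wp-content/", "/wp-includes/", "/feed", "/cdn-cgi/",
         ".js", ".css", ".png", ".jpg", ".svg", ".woff"]
pattern = "|".join(re.escape(p) for p in noise)

# Keep only rows whose URL does not match any of the noise patterns.
landing_pages = df[~df["URL"].str.contains(pattern, case=False, na=False)]

print(len(landing_pages), "landing page rows kept")
landing_pages.to_excel("landing_pages_only.xlsx", index=False)
```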

Step 3: Summarize this data with a Pivot Table

To summarize this data with a PivotTable, first select the data and create a table from it.

[Image: creating an Excel table from the selected data]

Hit OK, and a table will be created.

Like this

[Image: the crawl data formatted as an Excel table]

Now click Summarize with PivotTable at the top, as you can see highlighted.

You will see a screen like this

[Image: the PivotTable sheet with the PivotTable Fields pane on the right]

On the right side, you can see the PivotTable Fields pane; there, check URL first and then Time.

As you do that, you get this data:

[Image: PivotTable listing each URL with its crawl dates and times]

And that's about what you need.

I can't show the entire sheet data screenshot here, but you get the idea by seeing the above screenshot.

As you can see, the homepage has been crawled by Google Bot on all those dates and at that specific time.

Obviously, Google will crawl the homepage more often because it is the root domain of the website, and generally there are more referring domains pointing to the root domain.

But here you can see how often the "Australia Guest Posting Sites" article has been crawled in the last 3 months:

[Image: PivotTable rows showing crawl requests for the "Australia Guest Posting Sites" article]

The crawl requests were 36 to be precise.
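If you would rather script it, the PivotTable count comes down to a simple group-by. Here is a minimal sketch, assuming the filtered export from Step 2 is saved as landing_pages_only.xlsx with URL and Time columns (the column names are taken from the PivotTable fields above, but treat them and the file name as assumptions to adjust for your file).

```python
import pandas as pd

# Hypothetical file name: the filtered export from Step 2.
df = pd.read_excel("landing_pages_only.xlsx")

# Count how many crawl requests each URL received, most crawled first.
crawl_counts = (
    df.groupby("URL")["Time"]
      .count()
      .sort_values(ascending=False)
      .rename("crawl_requests")
)

print(crawl_counts.head(20))
```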

The blog posts that I see being crawled often can be assumed to be the authority pages of my website.

Using the Excel sheet, you can also find the pages you think are important, the ones you have worked hard on, that are being crawled the least or haven't been crawled at all.

Either those pages need fixing or else you may need to make the call to remove them altogether.
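One way to surface those neglected pages is to check a hand-picked list of important URLs against the crawl counts. A minimal sketch, using hypothetical URLs and the same assumed file and column names as above:

```python
import pandas as pd

df = pd.read_excel("landing_pages_only.xlsx")
crawl_counts = df.groupby("URL")["Time"].count()

# Hypothetical list: the pages you worked hard on and care about most.
important_pages = [
    "https://www.example.com/pillar-guide/",
    "https://www.example.com/big-case-study/",
]

# 0 means the URL never showed up in the 200 (OK) crawl export at all.
crawled = crawl_counts.reindex(important_pages).fillna(0).astype(int)
print(crawled[crawled <= 1])  # candidates to fix, relink, or remove
```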

There are a lot of insights you can uncover here. For example, instead of going to 200 (OK), you can go to 404 and see how often broken pages are being crawled.

I feel this is one of the activities you have to conduct when you onboard a new client and begin work, so as to uncover the most authoritative pages of that site.

You can leverage this data for internal linking, and after implementing the internal links, you can return a few months later, export the data again, and see for yourself whether the internal linking from high-authority to low-authority pages helped or not.
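Here is a minimal sketch of that before/after comparison, assuming you kept the filtered export from before the internal linking work as before.xlsx and the fresh export as after.xlsx; both file names are hypothetical, and the URL and Time columns are the same assumptions as earlier.

```python
import pandas as pd

# Crawl requests per URL in each export period.
before = pd.read_excel("before.xlsx").groupby("URL")["Time"].count().rename("before")
after = pd.read_excel("after.xlsx").groupby("URL")["Time"].count().rename("after")

# Line the two periods up side by side and compute the change.
comparison = pd.concat([before, after], axis=1).fillna(0).astype(int)
comparison["change"] = comparison["after"] - comparison["before"]

# URLs whose crawl frequency grew the most after the internal linking push.
print(comparison.sort_values("change", ascending=False).head(20))
```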

Similarly, the impact of external backlinks can be measured with the Crawl Stats report.

