Checking PDF indexing with URLinspector
Christoph C. Cemper
Founder of AIPRM, RAGIDX, LinkResearchTools (LRT), Link Detox, URLinspector
Using a simple "URL Ends With" filter, see (6), we can see a couple of interesting things in URLinspector.
(1) All selected URLs (the PDFs) are indexed OK
(2) One URL is reported as "Page with redirect".
Checking with the free Link RedirectTrace browser extension, we find that an http to https redirect is implemented.
You can get that extension that a lot of SEOs love here:
(3) User specified canonical and Google decided Canonical do not match.
It turns out that no canonical was specified at the content level (not possible in a PDF),
nor in the HTTP header.
This is fine, and we will probably introduce a new status "N/A" there, as "Not matching" sounds like an error when it is not.
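For PDFs, the only place a canonical can be declared is the HTTP `Link` response header. The sketch below shows one way to read it and derive a status. The function names and the "Matching" label are assumptions made for illustration ("N/A" and "Not matching" come from the text above), and this is not how URLinspector itself is implemented.

```python
import re

def canonical_from_link_header(link_header):
    """Return the rel="canonical" target from an HTTP Link header, or None."""
    if not link_header:
        return None
    # Each entry looks like: <https://example.com/file.pdf>; rel="canonical"
    for match in re.finditer(r'<([^>]*)>\s*((?:;\s*[^,;]+)*)', link_header):
        url, params = match.group(1), match.group(2)
        if re.search(r'\brel\s*=\s*"?[^";]*\bcanonical\b', params, re.IGNORECASE):
            return url
    return None

def canonical_status(user_canonical, google_canonical):
    """Hypothetical status labels: no declared canonical is 'N/A', not an error."""
    if user_canonical is None:
        return "N/A"
    return "Matching" if user_canonical == google_canonical else "Not matching"

# A PDF response without any Link header has no user-specified canonical.
print(canonical_status(canonical_from_link_header(None),
                       "https://example.com/whitepaper.pdf"))
```

With this distinction, a PDF that simply has no declared canonical is no longer lumped together with pages whose declared canonical Google overrides.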
(4) All 56 PDFs pass the "Index Verdict" as defined in the Google Inspection API.
That's great news, and confirms what we see.
We don't have results for Mobile or Rich Items on these URLs, which also makes sense.
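To make the verdict check concrete, here is a rough sketch of reading the relevant fields from a Google URL Inspection API response. The field names follow the public Search Console API (`inspectionResult`, `indexStatusResult`, `verdict`, and so on). The sample response is hand-written for illustration and is not real data from this analysis, and the helper name is an assumption.

```python
def summarize_inspection(response):
    """Pull the index-related fields out of a URL Inspection API response dict."""
    result = response.get("inspectionResult", {})
    status = result.get("indexStatusResult", {})
    return {
        "verdict": status.get("verdict"),            # e.g. PASS, FAIL, NEUTRAL
        "coverage": status.get("coverageState"),
        "user_canonical": status.get("userCanonical"),
        "google_canonical": status.get("googleCanonical"),
        "last_crawl": status.get("lastCrawlTime"),
        # For PDFs these sections are typically absent from the response.
        "has_mobile": "mobileUsabilityResult" in result,
        "has_rich_results": "richResultsResult" in result,
    }

# Illustrative response for a PDF URL (made-up values).
sample = {
    "inspectionResult": {
        "indexStatusResult": {
            "verdict": "PASS",
            "coverageState": "Indexed, not submitted in sitemap",
            "googleCanonical": "https://example.com/whitepaper.pdf",
            "lastCrawlTime": "2023-01-15T00:00:00Z",
        }
    }
}

summary = summarize_inspection(sample)
print(summary["verdict"], summary["has_mobile"], summary["has_rich_results"])
```

Missing keys come back as `None` or `False` instead of raising an error, which suits PDFs where the mobile and rich result sections are not present.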
(5) In the "Days since Last Crawl"...
We see a chunk of 25 PDFs that have not been crawled for 60 days.
A few are crawled more frequently, and a few even less often.
Therefore we can expect Google to take over 2 months to recrawl the PDFs and pick up those http to https changes.
We could maybe get them crawled faster, but it appears acceptable given that it's a pure change of protocol.
If the content needed to change, that would be a different story.
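As a rough illustration of how a "Days since Last Crawl" figure can be derived, the sketch below converts the `lastCrawlTime` timestamp from the inspection result into whole days and groups URLs into buckets. The bucket boundaries and function names are assumptions chosen for this example, and the timestamps are made up. It also assumes timestamps without fractional seconds, which older Python versions parse more reliably.

```python
from collections import Counter
from datetime import datetime, timezone

def days_since_last_crawl(last_crawl_time, now):
    """Whole days between an RFC 3339 UTC timestamp and `now`."""
    crawled = datetime.fromisoformat(last_crawl_time.replace("Z", "+00:00"))
    return (now - crawled).days

def crawl_bucket(days):
    """Hypothetical buckets for grouping URLs by crawl age."""
    if days < 30:
        return "under 30 days"
    if days < 60:
        return "30 to 59 days"
    return "60 days or more"

# Made-up crawl timestamps for three PDFs, evaluated at a fixed point in time.
now = datetime(2023, 3, 16, tzinfo=timezone.utc)
last_crawls = [
    "2023-01-15T00:00:00Z",
    "2023-03-01T00:00:00Z",
    "2022-12-01T00:00:00Z",
]

ages = [days_since_last_crawl(ts, now) for ts in last_crawls]
buckets = Counter(crawl_bucket(d) for d in ages)
print(ages, dict(buckets))
```

A large "60 days or more" bucket is the signal that protocol or content changes on those URLs will take a while to be picked up.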
(6) Column filters can be set up right in the table
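Outside the tool, a "URL Ends With" style filter can be approximated in a few lines. This sketch compares only the URL path and ignores case, which is an assumption made for this example. URLinspector's own filter may behave differently, and the URLs are placeholders.

```python
from urllib.parse import urlsplit

def url_ends_with(url, suffix):
    """True if the URL path ends with `suffix`, ignoring case and query string."""
    return urlsplit(url).path.lower().endswith(suffix.lower())

urls = [
    "http://example.com/docs/report.pdf",
    "https://example.com/docs/Guide.PDF?download=1",
    "https://example.com/pdf-overview/",
]

pdf_urls = [u for u in urls if url_ends_with(u, ".pdf")]
print(pdf_urls)
```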
(7) "Days since Last Crawl" is abbreviated as "C-Days" in URLinspector
Here's a thread on Twitter explaining some examples; we already saw some patterns there.
I hope you found this post interesting.
If you would like to do a similar analysis for your own website, then you can try the free Beta of URLinspector.
You can add all your sites to URLinspector with 1-click.