54% of SEOs Got This Question Wrong
Aquif Shaikh
Founder of BloggingOcean.com, HostingPeek.com, and Hostbits.io | Blogger, Web Hosting Expert, & SEO Geek
Welcome to my SEO newsletter!
Each week I will be bringing you tips, tricks, and tactics to help your website rank higher in search engines.
This week I'll be talking about a poll that I created in one of the LinkedIn SEO Groups.
I asked the SEOs one simple question:
Q. You don't want a page to be indexed. What would be the best option to achieve this?
Below were the options for the poll:

1. Block the page through robots.txt
2. Use a no-index tag only
3. Block through robots.txt and use a no-index tag
Pretty straight and simple, right?
But guess what?
54% of SEOs got this answer wrong. And the sample size wasn't too small either: a total of 589 participants voted in this poll.
As you can see, 22% of SEOs think robots.txt is the best option to make sure a page is not indexed, and 32% of SEOs think that combining a robots.txt block with a no-index tag would be a great idea.
However, only 46% of SEOs got the answer right: you should use the no-index tag only.
Now let us analyze why those SEOs chose those answers, and why those methods fail to stop Google from indexing pages.
As you all know, robots.txt is a file that acts as a directive to all bots, including search engine bots, telling them how a website should be crawled: which pages may be crawled and which shouldn't be.
Most search engine bots, including Googlebot, strictly follow the rules in the robots.txt file and stop crawling a page as soon as they see that it is blocked by robots.txt.
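For illustration, here is a minimal robots.txt that tells all bots not to crawl a hypothetical /private-page/ path (the path is just an example):

    # robots.txt at the root of the domain
    User-agent: *
    Disallow: /private-page/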
Obvious logic says that since Googlebot won't crawl the page, it won't index it either.
This was the reason some people picked the robots.txt option in the poll.
However, there is a catch. As per Google, it will respect the rules in the robots.txt file and not crawl the page, but if the page is linked from anywhere else on the internet, Google may still index it.
Below is the specific guideline from Google
You can find this guideline in Google's official robots.txt documentation. Watch out for the key sentence:
While Google won't crawl or index the content blocked by a robots.txt file, we might still find and index a disallowed URL if it is linked from other places on the web. As a result, the URL address and, potentially, other publicly available information such as anchor text in links to the page can still appear in Google search results. To properly prevent your URL from appearing in Google search results, password-protect the files on your server, use the noindex meta tag or response header, or remove the page entirely.
So blocking a web page using robots.txt is not a reliable way to keep it out of the index.
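Putting Google's recommendation into practice: the no-index meta tag goes inside the page's <head>, and the equivalent X-Robots-Tag response header can be used for non-HTML files such as PDFs:

    <!-- In the <head> of the page you want kept out of the index -->
    <meta name="robots" content="noindex">

    # Or, as an HTTP response header (useful for PDFs and other non-HTML files)
    X-Robots-Tag: noindex

Either one tells Google not to index the page, provided Googlebot is allowed to crawl the page and actually see the tag or header.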
Please note that prior to 2019, it was possible to add an unofficial no-index rule in robots.txt itself. However, in July 2019, Google announced it would stop supporting this rule, effective September 1, 2019.
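For reference, the now-unsupported syntax that some SEOs used looked like this (the path is illustrative); Googlebot simply ignores it today:

    User-agent: Googlebot
    Noindex: /private-page/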
Now, the next question arises: why is the third option, i.e., blocking through robots.txt in addition to using a no-index tag, also a wrong answer?
The answer here is simple. When a page is linked from elsewhere on the internet, Google will try to crawl it. But the very first thing Google does is look up the robots.txt file.
As soon as it sees that the page is blocked by robots.txt, it will stop the crawl and hence never see the no-index tag. Thus it may still end up indexing the page, just as it would if there were no no-index tag at all.
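To make this concrete, here is the self-defeating setup (the /private-page/ path is just an illustration). The Disallow rule stops Googlebot from ever fetching the page, so the meta tag below is never read:

    # robots.txt -- blocks the crawl
    User-agent: *
    Disallow: /private-page/

    <!-- /private-page/ HTML head -- never crawled, so never read -->
    <meta name="robots" content="noindex">

The fix is to remove the Disallow rule so that Googlebot can crawl the page and actually see the no-index tag.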
Such aspects of technical SEO are very important for every SEO to know, which makes it strange that most SEOs get this wrong.
So, the next time someone asks you this question in an interview, answer with confidence that you should use the no-index tag. The other ways to make sure a page is not indexed are to password-protect it or to return a 404 (Not Found) or 410 (Gone) status code.
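If you choose the 410 route, a one-line rule in your web server configuration is enough. Here is a sketch for nginx, assuming a retired /old-page/ URL:

    # Return 410 Gone for the retired page
    location = /old-page/ {
        return 410;
    }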
If you accidentally got a page indexed, you can 404 or 410 it, or mark it as no-index, and then either wait for Google to recrawl the URL or use the Removals tool in Google Search Console to take it down quickly. Keep in mind that the Removals tool only hides the URL temporarily, so the no-index tag or error status still needs to stay in place.
I hope you found this week's SEO newsletter helpful. Stay tuned for next week's edition.
If you have any questions or need help from my side, send me a DM and I will try my best to address your queries.
Also, let me know what you'd like me to include in my future newsletters.
Thank you for reading and until next time, Happy Optimizing!
This article was originally published at the below link