Hey guys, before we jump into the newsletter, I want to share the story that pushed me to write about this topic.
I was hanging out with some fellow SEOs the other day, swapping tips and tricks like usual. We were all hyped, talking about the latest strategies to get websites climbing the search engine ranks. Then, I casually mentioned checking the "noindex" section in Google Search Console (GSC).
Awkward silence. You know that moment when the conversation just...dies? Yeah, that's what happened. Turns out, a bunch of these smart SEOs barely even glanced at the noindex section. Some admitted it, some looked genuinely confused.
Now, listen up. This blew my mind. The noindex section can be a total game-changer. It's like having a secret weapon in your SEO arsenal: a tool that can show hidden problems stopping your website from getting seen. And yet, most SEOs seem to be leaving it untouched. I don't know why. Do they really not care about it, or do they just not know how to solve it?
That's why I'm writing this. Because ignoring the noindex section is like having a whole toolbox full of SEO goodies and just using the fancy hammer. Sure, the hammer might get some things done, but you're missing out on the screwdriver, the wrench, the whole kit! The noindex section can help you fix some of the most common issues stopping your website from getting indexed, which is kind of a big deal in the SEO world.
Have a glance at a few of the reasons you might see in your GSC.
The above image lists 14 possible reasons why a website might not be indexed. Let's take a look at each of them:
- Excluded by 'noindex' tag: This means that you have specifically instructed search engines not to index the page. This is usually done for pages that are not meant to be seen by the public, such as login pages or thank-you pages.
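- Blocked by robots.txt: Robots.txt is a file that tells search engines which pages on your website they should not crawl. If a page is blocked by robots.txt, Google can't read its content, so it usually won't be indexed (though a blocked URL can still show up in results if other pages link to it).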
- Alternative page with proper canonical tag: This means that there is another page on your website that is considered to be the "canonical" version of the page in question. Search engines will typically only index the canonical version of a page.
- Page with redirect: This means the URL redirects to another URL. Search engines follow the redirect and index the destination page instead, so the redirecting URL itself is not indexed.
- Not found (404): This means that the page does not exist on your website. Search engines will not index pages that return a 404 error.
- Blocked due to access forbidden (403): This means that the page is forbidden from being accessed by search engines. This could be due to a number of reasons, such as password protection or IP restrictions.
- Server error (5xx): This means that there is a problem with your web server that is preventing search engines from accessing the page.
- Soft 404: This means the page returns a success code (200) but looks like an error page or has little to no content, so Google treats it like a 404 and does not index it.
- Duplicate without user-selected canonical: This means that the page is a duplicate of another page on your website, you have not specified a canonical, and Google has picked the other page as the canonical one to index.
- Blocked due to other 4xx issue: There are a number of other 4xx errors that can prevent a website from being indexed. These errors typically indicate that there is a problem with the request that was made to the web server.
- Redirect error: This means that there is a problem with the redirect that is in place on the page. This could be due to a number of reasons, such as a redirect loop, a redirect chain that is too long, or a bad or empty URL somewhere in the chain.
- Crawled - currently not indexed: This means that Google has crawled the page, but it has not indexed it, at least not yet. It may or may not be indexed later.
- Discovered - currently not indexed: This means that Google is aware of the page, but it has not yet crawled it.
- Duplicate, Google chose different canonical than user: This means that there is duplicate content on your website, and Google has chosen a different page to be the canonical page than the one that you specified.
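Several of the reasons above come straight from the HTTP status code a URL returns. Here is a minimal sketch of that mapping, assuming you already have the status code from your own crawl or server logs. The function name `classify_status` is made up for illustration, and this is not how Google itself classifies pages.

```python
def classify_status(status: int) -> str:
    """Map an HTTP status code to the closest reason label from the list above."""
    if 300 <= status < 400:
        return "Page with redirect"
    if status == 404:
        return "Not found (404)"
    if status == 403:
        return "Blocked due to access forbidden (403)"
    if 400 <= status < 500:
        # Any 4xx code not named in the list above falls into the generic bucket.
        return "Blocked due to other 4xx issue"
    if 500 <= status < 600:
        return "Server error (5xx)"
    # 2xx responses need a look at the page itself (noindex, canonical, soft 404).
    return "No status-related reason"


for code in (200, 301, 403, 404, 410, 503):
    print(code, "->", classify_status(code))
```

A 200 response is not the end of the story: noindex tags, canonicals, and soft 404s only show up once you look at the page content.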
Which One to Solve First?
The most important reason to fix first is "Excluded by 'noindex' tag". This is because it means that you are specifically telling search engines not to index the page. If you want the page to be indexed, then you need to remove the 'noindex' tag.
Once you've fixed the "noindex" tag issue, you should then focus on fixing any of the other issues that are preventing your most important pages from being indexed. These might include things like fixing broken links, resolving server errors, and making sure that your robots.txt file is not blocking any important pages.
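To find a stray noindex, remember it can sit in two places: a robots meta tag in the HTML, or an X-Robots-Tag HTTP response header. The sketch below checks both, assuming you have already fetched the page's HTML and headers yourself. The names `RobotsMetaParser` and `has_noindex` are hypothetical, and the header handling is simplified (it ignores user-agent-specific prefixes such as "googlebot: noindex").

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html: str, headers: Optional[Dict[str, str]] = None) -> bool:
    """Return True if the HTML or the response headers ask not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    directives = list(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            directives += [d.strip().lower() for d in value.split(",")]
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in directives or "none" in directives


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
print(has_noindex("<p>Hello</p>", {"X-Robots-Tag": "noindex"}))
```

Running a check like this over your most important URLs is a quick way to confirm that a noindex is there on purpose and not left over from a staging setup.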
Which Ones Can You Ignore?
There are a few reasons on the list that you can probably ignore. For example, you can ignore the "Crawled - currently not indexed" and "Discovered - currently not indexed" messages. These simply mean that Google has not yet had a chance to index the page. It may take some time for Google to crawl and index all of the pages on your website.
You can also probably ignore the "Duplicate, Google chose different canonical than user" message. This means that there is duplicate content on your website, and Google has chosen a different page to be the canonical page than the one that you specified. This is not necessarily a bad thing, as Google is usually pretty good at choosing the correct canonical page.
Which Ones Have the Potential to Improve SEO?
Several reasons on the list can significantly impact your SEO if not addressed. Here's how they affect SEO and what you can do to improve:
- Blocked by robots.txt: Robots.txt tells search engines which pages to crawl. Accidentally blocking important pages can prevent indexing. Regularly review your robots.txt to ensure only intended pages are blocked (https://support.google.com/webmasters/answer/6062598?hl=en).
- Alternative page with proper canonical tag: This is usually fine, since it means your canonical tag is doing its job. It only becomes a problem when the canonical tag points to the wrong page, which can confuse search engines. Use clear and consistent internal linking and ensure the most relevant page is designated as the canonical version (https://www.canonicaltag.com/how-canonical-tags-can-help-search-engine-rankings/).
- Page with redirect: Poorly implemented redirects can cause indexing issues. Use clear 301 redirects for permanent page moves and avoid redirect chains (https://moz.com/blog/301-redirection-rules-for-seo).
- Not found (404) & Blocked due to access forbidden (403): These errors prevent search engines from accessing pages. Fix broken links and ensure proper permissions for search engine crawlers.
- Server error (5xx): Frequent server errors signal instability and can discourage search engines from crawling. Address server-side issues with your hosting provider.
- Soft 404: Search engines might interpret dynamic pages with thin content as soft 404s. Enrich these pages with valuable content or consider de-indexing them.
- Duplicate without user-selected canonical: Duplicate content across your site or thin content scraped from other sources can hurt SEO. Consolidate or remove duplicate content, set a canonical tag on the version you want indexed, and create unique, informative content.
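For the robots.txt review mentioned in the list above, here is a rough sketch using Python's built-in `urllib.robotparser`. It assumes you paste in your robots.txt text and a list of URLs you care about. The function name `blocked_urls` and the example.com URLs are made up for illustration. One caveat: Python's parser does not follow exactly the same matching rules as Googlebot (for example, it does not handle wildcards in paths the way Google does), so treat the result as a first pass rather than the final word.

```python
from typing import List
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: List[str], user_agent: str = "Googlebot") -> List[str]:
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that blocks /blog/ by mistake.
robots_txt = """\
User-agent: *
Disallow: /admin/
Disallow: /blog/
"""

important_pages = [
    "https://example.com/blog/post-1",
    "https://example.com/pricing",
    "https://example.com/admin/login",
]

for url in blocked_urls(robots_txt, important_pages):
    print("Blocked:", url)
```

If a page you want indexed shows up in the output, that Disallow rule is worth a second look.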
Now, roll up your sleeves and get ready to tackle these issues head-on. Whether it's a sneaky 'noindex' tag or a troublesome robots.txt file, identifying and resolving these problems is crucial. By taking decisive action and fine-tuning your site for search engines, you'll ensure your website doesn't just exist but thrives in search rankings. Don't let your efforts go to waste. Fight for your visibility and watch your organic traffic soar.