Weekly SEO News Jul 1-5

Google Search Console Delays Are Not a Google Core Update

Source: https://www.dhirubhai.net/posts/googlesearchcentral_we-have-been-experiencing-latency-issues-activity-7213802989692583936-evi3?

There have been rumors suggesting that the recent delays in Google Search Console reporting are indicative of an unconfirmed core update rolling out. However, this is not the case. John Mueller of Google clarified that these delays are unrelated to any core updates, confirmed or unconfirmed.

John Mueller posted on LinkedIn, stating, "To be direct: no, this is not an unconfirmed core update, we'll announce it when it's time."

He further explained that "in general, reporting is independent of the process of making ranking updates."

Search Console reporting is separate from Google's ranking systems, so delays in reporting have no connection to core updates or any other ranking changes.

Moreover, confirmed updates are documented on the Google Search Status Dashboard, while reporting delays are not. Questions like this come up whenever Search Console reporting lags; it is nothing new.

You Don’t Need Robots.txt On Root Domain, Says Google

Source: https://www.dhirubhai.net/posts/garyillyes_you-probably-heard-before-that-your-robotstxt-activity-7214278388490907648-Fta5

Google's Gary Illyes shares an unconventional but valid method for centralizing robots.txt rules on CDNs.

In a recent LinkedIn post, Google Analyst Gary Illyes challenged the long-standing belief that a website’s robots.txt file must reside at the root domain (e.g., example.com/robots.txt).

Illyes clarified that this isn’t an absolute requirement and revealed a lesser-known aspect of the Robots Exclusion Protocol (REP).

  • Robots.txt File Flexibility

The robots.txt file doesn’t have to be located at the root domain (example.com/robots.txt).

According to Illyes, it’s permissible to have two separate robots.txt files hosted on different domains—one on the primary website and another on a content delivery network (CDN).

Illyes explains that websites can centralize their robots.txt file on the CDN while controlling crawling for their main site. For instance, a website could have two robots.txt files: one at https://cdn.example.com/robots.txt and another at https://www.example.com/robots.txt.

This approach allows you to maintain a single, comprehensive robots.txt file on the CDN and redirect requests from the main domain to this centralized file. Illyes notes that crawlers complying with RFC9309 will follow the redirect and use the target file as the robots.txt file for the original domain.
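If you adopt this setup, it is worth verifying that the redirect resolves the way an RFC 9309-compliant crawler would expect. The sketch below is a quick, unofficial check using Python's requests library; the example.com and cdn.example.com hostnames are the same placeholders used above, not real endpoints.

```python
# A minimal sketch (not an official Google tool) for checking how a site's
# robots.txt is served. The hostname below is the placeholder example from
# the post; swap in your own main domain.
import requests

MAIN_ROBOTS = "https://www.example.com/robots.txt"  # placeholder main domain


def inspect_robots(url: str) -> None:
    # allow_redirects=True mirrors RFC 9309 behaviour: a compliant crawler
    # follows the redirect and treats the target file as the robots.txt
    # for the original host.
    response = requests.get(url, allow_redirects=True, timeout=10)

    print(f"Requested: {url}")
    for hop in response.history:
        print(f"  redirect {hop.status_code} -> {hop.headers.get('Location')}")
    print(f"Final URL: {response.url} (status {response.status_code})")
    print(response.text[:200])  # first few rules of the effective robots.txt


if __name__ == "__main__":
    inspect_robots(MAIN_ROBOTS)
```

Running this against your main domain should show the redirect hop to the CDN-hosted file and then print the rules that crawlers will actually apply to the original domain.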

  • Looking Back At 30 Years Of Robots.txt

As the Robots Exclusion Protocol celebrates its 30th anniversary this year, Illyes’ revelation highlights how web standards continue to evolve. He even speculates whether the file needs to be named “robots.txt,” hinting at possible changes in how crawl directives are managed.

How This Can Help You

Following Illyes’ guidance can help you in the following ways:

  • Centralized Management: By consolidating robots.txt rules in one location, you can maintain and update crawl directives across your web presence.
  • Improved Consistency: A single source of truth for robots.txt rules reduces the risk of conflicting directives between your main site and CDN.
  • Flexibility: This approach allows for more adaptable configurations, especially for sites with complex architectures or those using multiple subdomains and CDNs.

A streamlined approach to managing robots.txt files can improve both site management and SEO efforts.

Google Warns Of Soft 404 Errors And Their Impact On SEO

Source: https://www.dhirubhai.net/posts/garyillyes_soft-404s-and-other-softcrypto-errors-the-activity-7212228210166509573-c6eN

Google's Gary Illyes warns about the impact of soft 404 errors on web crawling and recommends proper error handling to improve SEO and site efficiency. In a recent LinkedIn post, Illyes highlighted two issues affecting web crawlers: soft 404 errors and other "crypto" errors. These seemingly minor issues can negatively impact SEO efforts.

  • Understanding Soft 404s

Soft 404 errors occur when a web server returns a standard “200 OK” HTTP status code for pages that don’t exist or contain error messages. This misleads web crawlers, causing them to waste resources on non-existent or unhelpful content.

Illyes compared the experience to visiting a coffee shop where every item is unavailable despite being listed on the menu. While frustrating for human customers, this scenario poses a more serious problem for web crawlers. As Illyes explains:

“Crawlers use the status codes to interpret whether a fetch was successful, even if the content of the page is basically just an error message. They might happily go back to the same page again and again, wasting your resources, and if there are many such pages, exponentially more resources.”
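To make the pattern concrete, here is a rough detection sketch in Python. It is only a heuristic, not an official Google check: the URL list and error phrases are illustrative assumptions, and a content-based test like this will produce false positives that need manual review.

```python
# A rough heuristic sketch for spotting candidate soft 404s: pages that return
# HTTP 200 but whose body reads like an error page. The URL list and phrase
# list are illustrative assumptions; tune both for your own site.
import requests

ERROR_PHRASES = ("page not found", "doesn't exist", "no longer available")


def find_soft_404_candidates(urls):
    candidates = []
    for url in urls:
        response = requests.get(url, timeout=10)
        body = response.text.lower()
        # A 200 status combined with error-page wording is the soft 404
        # pattern Illyes describes: the crawler sees "success" even though
        # the content is effectively an error message.
        if response.status_code == 200 and any(p in body for p in ERROR_PHRASES):
            candidates.append(url)
    return candidates


if __name__ == "__main__":
    pages = ["https://www.example.com/discontinued-product"]  # placeholder URL
    for url in find_soft_404_candidates(pages):
        print("Possible soft 404:", url)
```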

  • The Hidden Costs Of Soft Errors

The consequences of soft 404 errors extend beyond the inefficient use of crawler resources. According to Illyes, these pages are unlikely to appear in search results because they are filtered out during indexing.

To combat this issue, Illyes advises serving the appropriate HTTP status code when the server or client encounters an error. This allows crawlers to understand the situation and allocate their resources more effectively. Illyes also cautioned against rate-limiting crawlers with messages like “TOO MANY REQUESTS SLOW DOWN,” as crawlers cannot interpret such text-based instructions.
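Below is a minimal sketch of both recommendations, written as a hypothetical Flask app (the framework, routes, and data are illustrative assumptions, not anything Google prescribes). Missing pages return a genuine 404, and rate limiting is signalled with the standard HTTP 429 status code rather than a plain-text message crawlers cannot interpret.

```python
# A minimal Flask sketch showing proper error handling for crawlers:
# real error status codes instead of a 200 page that merely says "not found",
# and HTTP 429 instead of a prose "slow down" message.
from flask import Flask, abort

app = Flask(__name__)

PRODUCTS = {"coffee": "Fresh roasted beans"}  # stand-in catalogue


@app.route("/products/<slug>")
def product(slug):
    if slug not in PRODUCTS:
        abort(404)  # missing page -> 404, not a 200 "not found" page
    return PRODUCTS[slug]


@app.errorhandler(404)
def not_found(error):
    # The body can still be a friendly error page; what matters for
    # crawlers is the 404 status code.
    return "This page does not exist.", 404


@app.route("/rate-limited-endpoint")
def rate_limited():
    # If you must slow crawlers down, use the status code, not prose.
    return "Too many requests", 429, {"Retry-After": "120"}
```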

  • Why This Matters

Soft 404 errors can impact a website’s crawlability and indexing. By addressing them, you allow crawlers to focus on fetching and indexing pages with valuable content, potentially improving the site’s visibility in search results. Eliminating soft 404 errors also leads to more efficient use of server resources, since crawlers won’t waste bandwidth repeatedly visiting error pages.

How This Can Help You

To identify and resolve soft 404 errors on your website, consider the following steps:

  • Regular Monitoring: Review your website’s crawl reports and server logs to identify pages returning HTTP 200 status codes despite containing error messages (a minimal log-scanning sketch follows this list).
  • Proper Error Handling: Implement proper error handling on your server so that error pages are served with the appropriate HTTP status codes (e.g., 404 for not found, 410 for permanently removed).
  • Use Google Search Console: Monitor your site’s coverage reports in Google Search Console to identify any pages flagged as soft 404 errors.
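As referenced in the first item above, here is a minimal log-scanning sketch for surfacing candidate soft 404s. It assumes a combined-format access log and treats frequent, very small 200 responses served to Googlebot as pages worth re-checking by hand; both the log format and the size threshold are assumptions to adjust for your setup.

```python
# Scan an access log for Googlebot requests that returned 200 with a very
# small body -- frequent, tiny "successful" responses are soft 404 candidates.
import re
from collections import Counter

LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) '
    r'(?P<bytes>\d+|-).*"(?P<agent>[^"]*)"$'
)


def suspicious_urls(log_path, max_bytes=2048):
    """Count Googlebot hits that returned 200 with a body under max_bytes."""
    hits = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = LOG_LINE.search(line)
            if not match or "Googlebot" not in match["agent"]:
                continue
            size = 0 if match["bytes"] == "-" else int(match["bytes"])
            if match["status"] == "200" and size < max_bytes:
                hits[match["path"]] += 1
    return hits.most_common(20)


if __name__ == "__main__":
    for path, count in suspicious_urls("access.log"):  # placeholder log path
        print(f"{count:5d}  {path}")
```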

Proactively addressing soft 404 errors can improve your website’s crawlability, indexing, and SEO.

Appreciate the clarity from Google on recent updates regarding Google Search Console delays and robots.txt protocols! Understanding these nuances is crucial for refining SEO strategies and maintaining site efficiency. Here’s how these insights can benefit your SEO efforts:

  • Avoid confusion by distinguishing between reporting delays and core updates in Google Search Console.
  • Simplify management by centralizing your robots.txt file on a CDN instead of hosting it at the root domain.
  • Improve crawlability and indexing by addressing soft 404 errors.

By staying informed and implementing these best practices, you can optimize your SEO strategy, enhance site performance, and boost search visibility. Click the links above to explore each topic in more detail.
