HUGE Google Search document leak reveals inner workings of ranking algorithm

Raj Nandan (Ranky SEO)

SEO Analyst | Helping Businesses Boost Rankings & Traffic with Proven SEO Strategies | SEO Specialist with Expertise in On-Page, Off-Page & Technical SEO | SEO Solutions for Business Success

Published: May 31, 2024

The documents reveal how Google Search is using, or has used, clicks, links, content, entities, Chrome data and more for ranking.

A trove of leaked Google documents has given us an unprecedented look inside Google Search and revealed some of the most important elements Google uses to rank content.

What happened. Thousands of documents, which appear to come from Google’s internal Content API Warehouse, were released March 13 on GitHub by an automated bot called yoshi-code-bot. These documents were shared with Rand Fishkin, SparkToro co-founder, earlier this month.

Read on to discover what we’ve learned from Fishkin, as well as Michael King, iPullRank CEO, who also reviewed and analyzed the documents (and plans to provide further analysis for Search Engine Land soon).

Why we care. We have been given a glimpse into how Google’s ranking algorithm may work, which is invaluable for SEOs who can understand what it all means. In 2023, we got an unprecedented look at Yandex Search ranking factors via a leak, which was one of the biggest stories of that year.

This Google document leak? It will likely be one of the biggest stories in the history of SEO and Google Search.

What’s inside. Here’s what we know about the internal documents, thanks to Fishkin and King:

Current: The documentation indicates this information is accurate as of March.
Ranking features: 2,596 modules are represented in the API documentation with 14,014 attributes.
Weighting: The documents did not specify how any of the ranking features are weighted – just that they exist.
Twiddlers: These are re-ranking functions that “can adjust the information retrieval score of a document or change the ranking of a document,” according to King.
Demotions: Content can be demoted for a variety of reasons, such as: a link that doesn’t match the target site; SERP signals that indicate user dissatisfaction; product reviews; location; exact match domains; porn.
Change history: Google apparently keeps a copy of every version of every page it has ever indexed. Meaning, Google can “remember” every change ever made to a page. However, Google only uses the last 20 changes of a URL when analyzing links.
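
To make the Twiddler idea concrete, here is a minimal sketch of a re-ranking pass, assuming a Twiddler is simply a function that takes an already-scored list of results and returns an adjusted one. The names Result, make_demotion and apply_twiddlers, the demotion rule and the 0.5 multiplier are all hypothetical; the leaked documents do not describe how any adjustment is implemented or weighted.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Result:
    url: str
    score: float  # score from an initial retrieval pass (illustrative only)


# Assumption: a "twiddler" is any function that maps a result list to an adjusted list.
Twiddler = Callable[[List[Result]], List[Result]]


def make_demotion(matches: Callable[[Result], bool], factor: float) -> Twiddler:
    """Build a hypothetical twiddler that multiplies matching scores by `factor`."""
    def twiddler(results: List[Result]) -> List[Result]:
        return [
            replace(r, score=r.score * factor) if matches(r) else r
            for r in results
        ]
    return twiddler


def apply_twiddlers(results: List[Result], twiddlers: List[Twiddler]) -> List[Result]:
    """Run each twiddler in order, then re-sort by the adjusted score."""
    for twiddle in twiddlers:
        results = twiddle(results)
    return sorted(results, key=lambda r: r.score, reverse=True)


initial = [Result("a.example/page", 0.9), Result("b.example/page", 0.8)]
# Hypothetical rule: halve the score of anything on a.example.
demote_a = make_demotion(lambda r: r.url.startswith("a.example"), 0.5)
ranked = apply_twiddlers(initial, [demote_a])
```

In this toy run the demoted page drops below the page it originally outranked, which is the kind of adjustment King's description implies: the initial score is computed first, and the re-ranking step changes the final order afterward.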

Links matter. Shocking, I know. Link diversity and relevance remain key, the documents show. And PageRank is still very much alive within Google’s ranking features. PageRank for a website’s homepage is considered for every document.

This doesn’t prove Google spokespeople have lied about links not being a “top 3 ranking factor” or links mattering less for ranking. Two things can be true at once. Again, we don’t know how any of these features are weighted.

Successful clicks matter. This should not be a shocker, but if you want to rank well, you need to keep creating great content and user experiences, based on the documents. Google uses a variety of measurements, including badClicks, goodClicks, lastLongestClicks and unsquashedClicks.

Also, longer documents may get truncated, while shorter content gets a score (from 0-512) based on originality. Scores are also given to Your Money Your Life content, like health and news.

What does it all mean? According to King:

“[Y]ou need to drive more successful clicks using a broader set of queries and earn more link diversity if you want to continue to rank. Conceptually, it makes sense because a very strong piece of content will do that. A focus on driving more qualified traffic to a better user experience will send signals to Google that your page deserves to rank.”

Documents and testimony from the U.S. vs. Google antitrust trial confirmed that Google uses clicks in ranking – especially with its Navboost system, “one of the important signals” Google uses for ranking.

Brand matters. Fishkin’s big takeaway? Brand matters more than anything else:

“If there was one universal piece of advice I had for marketers seeking to broadly improve their organic search rankings and traffic, it would be: ‘Build a notable, popular, well-recognized brand in your space, outside of Google search.'”

Entities matter. Authorship lives. Google stores author information associated with content and tries to determine whether an entity is the author of the document.

SiteAuthority: Google uses something called “siteAuthority”.

Google told us something like this existed in 2011, after the Panda update launched, stating publicly that “low quality content on part of a site can impact a site’s ranking as a whole.”
However, Google has denied having a website authority score in the years since then.

Chrome data. A module called ChromeInTotal indicates that Google uses data from its Chrome browser for ranking.

Whitelists. A couple of modules indicate that Google whitelists certain domains related to elections and COVID: isElectionAuthority and isCovidLocalAuthority. That said, we’ve long known Google (and Bing) have “exception lists” for when “specific algorithms inadvertently impact websites.”

Small sites. Another feature is smallPersonalSite – for a small personal site or blog. King speculated that Google could boost or demote such sites via a Twiddler. However, that remains an open question. Again, we don’t know for certain how much these features are weighted.

Other interesting findings. According to Google’s internal documents:

Freshness matters – Google looks at dates in the byline (bylineDate), URL (syntacticDate) and on-page content (semanticDate).
To determine whether a document is or isn’t a core topic of the website, Google vectorizes pages and sites, then compares the page embeddings (siteRadius) to the site embeddings (siteFocusScore).
Google stores domain registration information (RegistrationInfo).
Page titles still matter. Google has a feature called titlematchScore that is believed to measure how well a page title matches a query.
Google measures the average weighted font size of terms in documents (avgTermWeight) and anchor text.
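
The page-versus-site comparison in the list above can be illustrated with a small sketch. The leaked documents name siteRadius and siteFocusScore but do not say how either is computed, so everything here is an assumption: the site vector is taken as the plain average of its page vectors, topical fit is measured with cosine similarity, and the function names and toy two-dimensional vectors are made up for illustration.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean, used here as a stand-in for a site-level embedding."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def topical_fit(page_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Similarity of each page to the site centroid (higher means closer to the core topic)."""
    site_vector = centroid(page_vectors)
    return [cosine_similarity(page, site_vector) for page in page_vectors]


# Toy embeddings: two pages on the site's main topic, one off-topic page.
pages = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
scores = topical_fit(pages)
```

With these toy vectors the two on-topic pages score higher against the site centroid than the off-topic page, which is the general shape of a "core topic or not" check. Real embeddings would have hundreds of dimensions and come from a language model rather than being written by hand.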

The articles.

Secrets from the Algorithm: Google Search’s Internal Engineering Documentation Has Leaked by King on iPullRank
An Anonymous Source Shared Thousands of Leaked Google Search API Documents with Me; Everyone in SEO Should See Them by Fishkin on SparkToro

Update, May 29. Google provided a statement to Search Engine Land. Read our follow-up: Google responds to leak: Documentation lacks context.

Update, May 30. King has written a follow-up article for Search Engine Land:

How SEO moves forward with the Google Content Warehouse API leak

Join Mike King and me at SMX Advanced for a late-breaking session exploring the leak and its implications.

Dig deeper. Unpacking Google’s massive search documentation leak

Quick clarification. There is some dispute as to whether these documents were “leaked” or “discovered.” I’ve been told it’s likely the internal documents were accidentally included in a code review and pushed live from Google’s internal code base, where they were then discovered.

The source. Erfan Azimi, CEO and director of SEO for digital marketing agency EA Eagle Digital, posted a video, claiming responsibility for sharing the documents with Fishkin. Azimi is not employed by Google.
