Google Algorithm Exposed: Key Takeaways From the API Document Leak
A massive leak of API documentation from inside Google’s Search division recently shocked the SEO community and raised concerns about how Google may have been ranking websites for years. The leak revealed over 14,000 ranking attributes used by Google and also surfaced significant privacy incidents.
The documents highlight many of the factors Google uses to rank websites: the data it collects, how people interact with a site, the importance of links, content quality, a site's technical aspects, author information, and user behavior data. Here’s a detailed breakdown of the leaked documentation.
Google Algorithm Data Leak: What Happened, in Short
Thousands of pages of internal documents, which appear to come from Google’s internal Content API Warehouse, were released on GitHub on March 13 by an automated bot called yoshi-code-bot. SEO experts Rand Fishkin, co-founder of SparkToro, and Mike King were the first to report on the leaked material. The factors revealed contradict Google’s past public statements about which metrics it does not use.
Key Findings From Google API Document Leak
Ranking Features
The leak reveals 2,596 modules from API documentation, with 14,014 attributes corresponding to various ranking factors Google considers for search results. Google includes click-through rates, Chrome browser data, website size, and domain authority in its search rankings. Despite public claims to the contrary, links remain a crucial ranking factor.
Chrome Data
Matt Cutts was previously quoted as saying that Google doesn’t use Chrome data in organic ranking. However, the leaked documents mention “Chrome-related measurement attributes” in sections discussing how websites appear in SERPs.
Domain Authority
Google uses a "siteAuthority" feature, indicating that sitewide authority is measured. The leaked documents also mention elements that might contribute to such a score. This metric appears to evaluate a domain's trustworthiness, contradicting Google's public denials that domain authority is used as a ranking factor.
Successful Clicks & CTR Matter
The documents emphasize the need for high-quality content and a strong user experience to rank well. Through the NavBoost system, Google relies heavily on user click data in search rankings.
Links Still Matter
Despite speculation about the diminishing importance of links, the leak confirms that link diversity, relevance, and PageRank continue to play a crucial role in Google's ranking features.
Domain age
Older domains have an advantage in search rankings. The documentation indicates that domain age is a factor, highlighting the value of maintaining established web properties for long-term strategic planning.
Subdomains as separate entities
Contrary to Google’s public stance, subdomains are treated as separate entities rather than main domain extensions. This implies that if your travel blog is hosted on a subdomain, it needs to establish its credibility independently, affecting your total online strategy.
Natural Language Processing (NLP)
Google's algorithm increasingly understands natural language. Creating content that answers user queries naturally, using better language and relevant keywords, improves rankings.
Demotions
Content can be demoted for several reasons, including mismatched links, SERP signals indicating user dissatisfaction, poor customer reviews, geographic location, explicit content, and exact-match domains. This provides insight into Google's efforts to ensure the quality and relevance of search results.
Privacy Incidents
The leaked information describes a variety of privacy incidents, including the inadvertent collection of children's voice data and the exposure of home addresses of carpool users. They know where you go and what you say.
Weighting
The API documentation doesn't specify how heavily any ranking feature is weighted; it only confirms that the features exist. Even so, their existence underscores the multifaceted nature of Google's algorithm and the need for a holistic SEO approach.
Modular Ranking System
Google's ranking system is built from interconnected microservices rather than a single monolithic algorithm: Trawler crawls new web content, Alexandria manages core indexing, Mustang handles scoring and ranking, and SuperRoot processes user queries to retrieve search results.
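To make the division of labor concrete, here is a minimal sketch of such a staged pipeline. The stage names (Trawler, Alexandria, Mustang, SuperRoot) come from the leak, but every function body below is purely hypothetical illustration, not Google's actual logic.

```python
# Hypothetical sketch of a modular crawl -> index -> score -> serve pipeline.
# Stage names mirror the leak; the implementations are invented for illustration.

def trawler(urls):
    """Crawl stage: fetch raw page content for each URL (stubbed here)."""
    return {url: f"content of {url}" for url in urls}

def alexandria(pages):
    """Indexing stage: build an inverted index mapping terms to URLs."""
    index = {}
    for url, text in pages.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(url)
    return index

def mustang(index, query):
    """Scoring stage: rank candidate URLs by number of matching query terms."""
    scores = {}
    for term in query.lower().split():
        for url in index.get(term, ()):
            scores[url] = scores.get(url, 0) + 1
    return sorted(scores, key=scores.get, reverse=True)

def superroot(urls, query):
    """Query front end: run the full pipeline and return ranked results."""
    return mustang(alexandria(trawler(urls)), query)

results = superroot(["https://a.example", "https://b.example"], "content of")
```

The point of the modular design is that each stage can evolve independently; swapping in a better scorer does not require touching the crawler or the index.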
New Website Visibility
New websites often face ranking challenges known as the "Google Sandbox" effect. Google has denied that a sandbox exists, but the leaked files suggest otherwise, mentioning a "hostAge" parameter that may hold back new sites until they gain trust.
PageRank for Homepages
The leak reveals that every document has the PageRank of its site's homepage (the Nearest Seed version) associated with it. This underscores the significance of a strong homepage for overall site authority.
Authorship Importance
Google tracks and stores information about content authors, which can influence how content ranks. Authorship signals indicating who wrote a piece can affect its visibility, particularly in areas that demand expertise and credibility.
Whitelisting
According to analysis of the documents, Google maintains whitelists for certain topics, meaning that websites appearing in Google SERPs for these queries must be manually approved. Sensitive topics such as elections (flags like isElectionAuthority or isCovidLocalAuthority) have whitelists to prioritize "quality sources."
Change History
Google keeps a copy of every version of a page it indexes, which means Google can see every change you have ever made. However, only the last 20 versions of a URL are used for link analysis, indicating the importance of recent updates in SEO strategy.
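One simple way to picture this retention policy is a full archive paired with a fixed-size window of the 20 most recent versions that feed link analysis. The class below is a hypothetical model for illustration only, not Google's actual implementation.

```python
from collections import deque

class PageHistory:
    """Hypothetical model of per-URL version tracking: every version is
    archived, but only the most recent 20 feed link analysis."""

    LINK_ANALYSIS_WINDOW = 20  # per the leak, only the last 20 changes count

    def __init__(self):
        self.all_versions = []  # full archive: nothing is ever dropped
        self.recent = deque(maxlen=self.LINK_ANALYSIS_WINDOW)

    def record(self, snapshot):
        self.all_versions.append(snapshot)
        self.recent.append(snapshot)  # oldest entry falls off past 20

history = PageHistory()
for i in range(25):
    history.record(f"version-{i}")

len(history.all_versions)  # 25 -- every change is retained
len(history.recent)        # 20 -- only the latest window counts for links
```

The practical implication for SEO is the same either way: old link-building history on a page eventually ages out of the analysis window, so recent revisions carry the weight.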
Re-Ranking Functions (Twiddlers)
Google's "twiddlers" adjust search results through factors like NavBoost (user clicks), RealTimeBoost (real-time behavior), and QualityBoost (interaction quality).
To Wrap Up
Even with this extensive list of possible ranking factors, the main aim of any company should be to produce valuable, informative content that delivers a positive user experience. The leak exposed the complex relationship between Google and user privacy, and it offers valuable insight into Google's search algorithms. As professionals, we should use this opportunity to optimize our sites.