An Anonymous Source Shared Thousands of Leaked Google Search API Documents with Me; Everyone in SEO Should See Them - SparkToro
Abstract
The article discusses a leak of more than 2,500 pages of API documentation from Google's internal "Content API Warehouse". The documents offer detailed insight into how Google's search engine operates, including its use of user signals such as clicks, engagement, and browsing behavior to power its ranking algorithms.
Q&A
[01] Authenticity and Verification of the Leak
1. How did the author verify the authenticity of the leaked documents?
- The author reached out to several ex-Google employees, who confirmed that the leaked documents appear to be legitimate and match internal Google documentation they are familiar with.
- The author also consulted with a technical SEO expert, Mike King, who reviewed the documents and confirmed they appear to be authentic.
2. What are the limitations in confirming the authenticity and usage of the leaked information?
- The author acknowledges that the leaked documents do not definitively prove which specific elements are currently used in Google's ranking algorithms, as some features may have been retired or used only for testing.
- The author cautions against making definitive claims about Google's use of particular ranking factors based solely on the leaked information, as the documentation may not reflect the most recent changes to Google's systems.
[02] Key Insights from the Leak
1. What insights does the leak provide about Google's use of user signals and clickstream data?
- The leak reveals details about Google's "NavBoost" system, which appears to use signals like the number of searches for a keyword, clicks on search results, and the duration of clicks (long vs. short) to inform its ranking algorithms.
- The documentation suggests Google utilizes data from Chrome, cookie histories, and pattern detection to combat click spam and evaluate the quality of websites.
2. What does the leak indicate about Google's use of whitelists for certain types of searches?
- The leak suggests Google maintains whitelists of websites for searches related to COVID-19 and elections, controlling which sites may be shown or demoted for those topics, likely to limit the spread of misinformation.
3. How does the leak shed light on Google's use of quality rater feedback in its search systems?
- The leak suggests that signals from Google's quality rater platform, known as EWOK, may feed directly into the search ranking systems rather than being used only for training purposes.
4. What does the leak reveal about Google's use of click data to determine the quality and ranking of links?
- The leak suggests Google classifies links into different tiers of quality based on the click data associated with them, with higher-quality links passing more ranking signals than lower-quality links.
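
To make the click-based ideas in this section concrete, here is a minimal Python sketch of how long versus short clicks could be counted per page and then mapped to a link-quality tier. Everything in it is an illustrative assumption: the dwell-time cutoff, the tier thresholds, the choice of three tiers, and all names (`Click`, `count_long_clicks`, `link_tier`) are hypothetical and do not come from the leaked documentation, which does not show how Google actually computes these signals.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical values chosen only for illustration; the leak does not
# disclose real cutoffs or thresholds.
LONG_CLICK_SECONDS = 30.0          # assumed dwell time that counts as a "long" click
HIGH_TIER_MIN_LONG_CLICKS = 100    # assumed minimum long clicks for the "high" tier
MEDIUM_TIER_MIN_LONG_CLICKS = 1    # assumed minimum long clicks for the "medium" tier


@dataclass
class Click:
    url: str              # page that was clicked in the search results
    dwell_seconds: float  # time before the searcher returned to the results


def is_long_click(click: Click) -> bool:
    """Treat a click as 'long' when dwell time meets the assumed cutoff."""
    return click.dwell_seconds >= LONG_CLICK_SECONDS


def count_long_clicks(clicks: list[Click]) -> dict[str, int]:
    """Count long clicks per URL; URLs with only short clicks get 0."""
    counts: dict[str, int] = {}
    for click in clicks:
        counts.setdefault(click.url, 0)
        if is_long_click(click):
            counts[click.url] += 1
    return counts


def link_tier(long_clicks: int) -> str:
    """Map a page's long-click count to an assumed quality tier for its links."""
    if long_clicks >= HIGH_TIER_MIN_LONG_CLICKS:
        return "high"
    if long_clicks >= MEDIUM_TIER_MIN_LONG_CLICKS:
        return "medium"
    return "low"
```

Under these assumptions, a page with no long clicks lands in the "low" tier, where its outgoing links would pass little or no ranking signal, while a page with many long clicks lands in the "high" tier. The real systems, if they work along these lines at all, almost certainly combine many more inputs.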
[03] Implications for the Search Industry
1. How does the author view the changing importance of traditional SEO factors like content and links?
- The author believes that as Google's systems have evolved, factors like brand recognition, user intent, and navigational demand have become more important than classic SEO signals like content optimization and link building.
2. What is the author's advice for small and medium-sized businesses and newer creators/publishers?
- The author suggests that for most small and medium-sized businesses, as well as newer creators and publishers, SEO may show poor returns until they can establish credibility, navigational demand, and a strong brand reputation among their target audience.
3. What does the author call for in terms of how the search industry reports on and analyzes Google's public statements?
- The author encourages search industry journalists and authors to report more critically on Google's public statements, rather than repeating them without scrutiny or without comparing them to evidence that may contradict the company's claims.