The Google-Extended Crawler Documentation

Google has recently revised the documentation for Google-Extended, its web crawler user agent. The update reflects a product name change and clarifies whether blocking the crawler affects search results, a question that matters to publishers who opt out. The revised documentation also gives more explicit guidance on controlling whether a site's content is used in AI model training.

Google-Extended User Agent

Launched on September 28, 2023, Google-Extended gives web publishers a user agent for managing how their websites are crawled. Publishers can allow or deny the Google-Extended user agent through the Robots Exclusion Protocol (robots.txt), which lets them choose whether their content is included in AI training datasets.

Google refers to Google-Extended as a “standalone product token,” a term that departs from the “user agent” terminology most publishers are familiar with.

The initial announcement detailed the characteristics of the new user agent:

“Today we’re announcing Google-Extended, a new control that web publishers can use to manage whether their sites help improve Bard and Vertex AI generative APIs, including future generations of models that power those products.

By using Google-Extended to control access to content on a site, a website administrator can choose whether to help these AI models become more accurate and capable over time.”

To block Google-Extended, add a rule for the “Google-Extended” user agent to your robots.txt file:

User-agent: Google-Extended
Disallow: /
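
Before deploying a rule like this, it can help to confirm that it blocks only the intended token. The following is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the example.com URL are illustrative assumptions, and Python's parser may not match Google's own robots.txt handling in every edge case.

```python
from urllib.robotparser import RobotFileParser

# The rule from the article, held as a string so no network access is needed.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: check a URL against robots.txt rules for one agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder URL, not one taken from the article.
page = "https://example.com/article"
blocked = is_allowed(ROBOTS_TXT, "Google-Extended", page)
search = is_allowed(ROBOTS_TXT, "Googlebot", page)

print("Google-Extended allowed:", blocked)
print("Googlebot allowed:", search)
```

Under this parser, the rule denies Google-Extended while leaving Googlebot unaffected, because no rule group in the file names Googlebot or the wildcard agent.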

Google Changelog

Google maintains a log of significant updates related to guidance and interactions with web publishers and the search marketing community. A recent entry in Google’s developer pages announced a modification to the Google-Extended documentation.

This update follows the renaming of Bard to Gemini Apps and specifies that content accessible to Google-Extended is now used for Gemini Apps and the Vertex AI generative APIs. The revised language also reassures publishers that opting out of Google-Extended does not affect Google Search, addressing concerns about the consequences of blocking AI data collection.

What Changed?

Google’s Changelog clarifies that Google-Extended applies to Gemini Apps and has no bearing on Google Search.

The Changelog entry reads:

“Updated the description of the Google-Extended product token
What: With the name change of Bard to Gemini Apps, we clarified that Gemini Apps is affected by Google-Extended, and, based on publisher feedback, we specified that Google-Extended doesn’t affect Google Search.”

The revised documentation replaces the Bard brand name with Gemini Apps and adds the following sentence:

“Google-Extended does not impact a site’s inclusion or ranking in Google Search.”
