Generic Toolbar Indexing Debunk Post
Sometimes people think that the Google Toolbar led to Google indexing a page. Here’s a recent example: a story that speculates about how urls with the substring “mms2legacy” got indexed. Here’s where I started to disagree:
The reason for this [supposedly unlisted urls getting crawled --Matt], explained Ken Simpson, CEO of anti-spam company MailChannels, is that one’s Google Toolbar may be configured to pass URLs that one visits to Google for indexing. “If you run Google Toolbar, it knows pages you visit,” he said.
Folks with great memories may remember that I’ve talked about this before. Back in 2006, both Philipp Lenssen and Google OS did controlled experiments by visiting unlinked deep pages with the toolbar, and both concluded that the toolbar did not lead to those urls being indexed.
It’s good to reiterate this every couple years though, especially as Google has gotten better at finding new pages as it crawls. We get questions like this often enough that we have an FAQ answer about it:
Why is Googlebot downloading information from our “secret” web server?
It’s almost impossible to keep a web server secret by not publishing any links to it. As soon as someone follows a link from your “secret” server to another web server, your “secret” URL may appear in the referrer tag and can be stored and published by the other web server in its referrer log. So, if there’s a link to your “secret” web server or page on the web anywhere, it’s likely that Googlebot and other web crawlers will find it.
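To make the referrer leak concrete, here is a minimal sketch of how a "secret" url can end up in another site's logs. It assumes the other web server writes its access log in the common "combined" format; the log line, hostnames, and the helper name extract_referrer are all made up for illustration.

```python
import re

# Hypothetical access-log line in the "combined" format. The second-to-last
# quoted field is the referrer: the page the visitor was on when they
# clicked a link to this server.
LOG_LINE = (
    '203.0.113.7 - - [01/Sep/2008:10:15:32 +0000] '
    '"GET /index.html HTTP/1.1" 200 5120 '
    '"http://secret.example.com/hidden/page.html" "Mozilla/5.0"'
)

# Pattern for one combined-format line (an assumption about the log layout).
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def extract_referrer(line):
    """Return the referrer url from a log line, or None if there is none."""
    match = COMBINED.match(line)
    if match is None:
        return None
    referrer = match.group("referrer")
    # Servers conventionally log "-" when no referrer was sent.
    return None if referrer in ("", "-") else referrer


print(extract_referrer(LOG_LINE))
```

If the other site publishes its referrer statistics as a web page, that "secret" url is now a plain link that any crawler can follow.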
Security through obscurity is not a great way to keep a url from being crawled. If you don’t want your content in Google’s web index then we provide a ton of advice on how to prevent that content from getting into Google.
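One standard way to ask crawlers to stay away from part of a site is a robots.txt file. As a rough sketch, the snippet below uses Python's standard urllib.robotparser module to check what a given set of rules would allow; the rules, the example.com urls, and the helper name is_crawl_allowed are assumptions for illustration, not a description of how Googlebot itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: ask all crawlers to skip /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt, user_agent, url):
    """Check whether the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for url in (
    "http://www.example.com/index.html",
    "http://www.example.com/private/report.html",
):
    print(url, is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```

Keep in mind that robots.txt is itself publicly readable, so it tells well-behaved crawlers what to skip but does not keep a url secret. For content that truly must stay private, something like password protection is the safer choice.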
www.mattcutts.com
published @ September 1, 2008