Why New Content Briefly Flickers Out Of Google
This raised the question, "why would the engines drop new content out of the search results for a few hours after it has been in the results for a few days?" I don't know, but let me make an educated guess: sometimes there is a brief gap between when pages fall out of a smaller (but quicker to build) index and when a larger (but slower to build) index finishes getting rebuilt with those pages in it.
Google could remove a page from the small index only after it is in the big index, but then it would be in both indices for a while until the small index was rebuilt. This overlap means the small index is larger than necessary, so it can't rebuild as quickly as possible, and so won't be as fresh as possible. So perhaps they try to time it perfectly so there isn't any overlap and there isn't any gap. The problem is that as they crawl faster, grow their indices, add complexity to their indexing or let the intern check in his summer project, it is easy for a small gap to form. So maybe it is just hard to ensure that there is never any gap unless one is willing to waste resources by letting them overlap.
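To make the trade-off concrete, here is a minimal sketch of the timing problem. It is purely illustrative and not a description of how Google actually works: the `IndexSchedule` class and `handoff` function are made-up names, and times are just hours on an arbitrary clock.

```python
from dataclasses import dataclass


@dataclass
class IndexSchedule:
    """Hypothetical go-live times (in hours) for one page's handoff."""
    small_drops_at: float  # rebuilt small index (without the page) goes live
    large_adds_at: float   # rebuilt large index (with the page) goes live


def handoff(schedule: IndexSchedule) -> tuple[str, float]:
    """Classify the handoff and return (kind, duration in hours)."""
    delta = schedule.large_adds_at - schedule.small_drops_at
    if delta > 0:
        # The page left the small index before the large one had it:
        # it is in neither index, so it vanishes from results.
        return ("gap", delta)
    if delta < 0:
        # The page sits in both indices: safe, but the small index
        # is carrying pages it no longer needs.
        return ("overlap", -delta)
    return ("seamless", 0.0)


# Perfect timing, a wasteful-but-safe overlap, and a late large index.
on_time = handoff(IndexSchedule(small_drops_at=0.0, large_adds_at=0.0))
padded = handoff(IndexSchedule(small_drops_at=6.0, large_adds_at=0.0))
late = handoff(IndexSchedule(small_drops_at=0.0, large_adds_at=3.0))
```

In this toy model the only way to guarantee no gap is to schedule `small_drops_at` later than the worst-case `large_adds_at`, which is exactly the overlap that costs freshness.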
Chas (the developer who sits next to me) manages some indices with a large+small model that, for the record, never has gaps. And he contributes the fact that his large index starts rebuilding at midnight on Friday because load is lighter on the weekend. However, his computers are set to GMT, which means it starts Friday 5PM Pacific time. Well, it was a bit after 5PM on a Friday when Jane first noticed Linkscape dropped from Google's SERPs (I received her email at 5:28PM).
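The time zone arithmetic is easy to check. This is a quick sketch under two assumptions that are mine: that the Friday in question was October 17, 2008 (the Friday before this post), and that Pacific time was on daylight time (UTC-7) that week.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Pacific Daylight Time (UTC-7) was in effect in mid-October 2008.
PACIFIC = timezone(timedelta(hours=-7), "PDT")

# "Midnight on Friday" read as the first instant of the weekend in GMT.
# Assumption: the weekend in question began Saturday, October 18, 2008.
rebuild_start_gmt = datetime(2008, 10, 18, 0, 0, tzinfo=timezone.utc)
rebuild_start_pacific = rebuild_start_gmt.astimezone(PACIFIC)

# The email reporting the drop arrived at 5:28PM Pacific that Friday.
email_received = datetime(2008, 10, 17, 17, 28, tzinfo=PACIFIC)
minutes_after_start = (email_received - rebuild_start_pacific).total_seconds() / 60

print(rebuild_start_pacific.strftime("%A %I:%M %p %Z"))
print(f"Email arrived {minutes_after_start:.0f} minutes after the weekend began in GMT")
```

Under those assumptions, the weekend starts in GMT at 5PM on Friday in Seattle, and the email arrived 28 minutes later.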
So the theory is that Google had two indices that were supposed to go live in the first seconds of the weekend GMT. First was the new large index that added our page. Second was the new small index that dropped our page. Only the small index was on time.
Or at least that is the best theory I can come up with. What do you guys think?
www.seomoz.org
published @ October 21, 2008