(don't wanna link directly there - just feel it's a little dirty linking directly via these forums)
Following these guidelines will help Google find, index, and rank your site. Even if you choose not to implement any of these suggestions, we strongly encourage you to pay very close attention to the "Quality Guidelines," which outline some of the illicit practices that may lead to a site being removed entirely from the Google index or otherwise impacted by an algorithmic or manual spam action. If a site has been affected by a spam action, it may no longer show up in results on Google.com or on any of Google's partner sites.
This is right there in Google Webmasters(!), so it wasn't hard to research. I'd always assumed there must be some manual curation when it comes to de-indexing, both to help the algo and to prevent (or at least limit) false positives. And of course, let's not forget, they have to manually deal with reinclusion requests too (or do some think THAT is 100% automated as well?!). And yes, I've had one or two sites de-indexed in the past and then included again through a very manual process involving human reviews (confirmed by emails sent to me by Google).
So given that, it's hardly a leap to assume there's some manual curation going on with data pushes / Panda updates, both to help the algo next time around and to ensure their targets are pinpointed. Algorithms are human-led after all, and they rely on new data and rules set by humans. I don't see how that's possible without some level of manual review to "show it the way" and flag up similar sites next time.
Now before some people get all angry about this and think it's "either / or": of course I am not suggesting ALL sites that are de-indexed are done so manually. I don't doubt there's a threshold in their algo that draws a line between automatic de-indexing and "send for manual review / algo refinement", and I've no doubt that a lot of "sure-fire" spam sites get de-indexed automatically. But to suggest every site de-indexed on a data push is done so automatically is a bit naïve, given the complexity of the web and the fact that Google want to keep their own mistakes to a minimum.
I suppose those in denial will say "well Google would say that" when they say they perform manual reviews.
I just think it's healthier to assume Google are telling the truth on this, and to create sites as if they can stand up to a manual review.
Edited by lovethelink, 04 October 2011 - 12:47 AM.