My client expects an enormous amount of traffic, so we're going to set up multiple servers at hosting companies around the country. The main server (www.mysite.com) will randomly redirect each visitor to one of the other servers (www1, www2, etc.), where they will do their shopping. This will spread the load around.
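To make the setup concrete, here is a rough sketch of the redirect step as I picture it, written in Python only for illustration. The mirror hostnames and the function name pick_mirror_url are placeholders I made up, and the real site may well do this differently.

```python
import random
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mirror hostnames; the real list would come from configuration.
MIRRORS = ["www1.mysite.com", "www2.mysite.com", "www3.mysite.com"]


def pick_mirror_url(requested_url, mirrors=MIRRORS, rng=random):
    """Return the requested URL with its host swapped for a random mirror.

    The scheme, path, query string, and fragment are kept unchanged, so a
    visitor lands on the same page on whichever mirror is chosen.
    """
    parts = urlsplit(requested_url)
    host = rng.choice(mirrors)  # uniform random choice among the mirrors
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

The main server would then send the returned URL in the Location header of a redirect response.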
My problem is that we want Google to be able to spider the main server (www). I can easily put robots.txt files on the other servers so that Google doesn't spider, for example, www47, but how can I make sure that Google can traverse the "www" machine while real customers are sent to the mirror machines?
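For the mirror side, this is roughly the robots.txt I have in mind, along with a small Python check (standard library only) that it really blocks a crawler. The constant MIRROR_ROBOTS_TXT, the helper is_crawlable, and the sample URLs are just names I picked for this sketch.

```python
from urllib.robotparser import RobotFileParser

# robots.txt I would place on every mirror (www1, www2, ...): block all crawlers.
MIRROR_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling is_crawlable(MIRROR_ROBOTS_TXT, "http://www47.mysite.com/shop/item") should come back False, while a robots.txt on "www" with an empty Disallow line leaves everything open to the spider.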
I'm guessing that I can put a link to my content on the page that redirects real users, but I don't want Google to think that I'm trying to pull a stunt (because I'm not). They penalize you if they think you are trying to get good rankings by showing the spider one type of content while redirecting visitors to other content. I'm wondering whether any of you have successfully done something like this and what the correct approach is.
I know this isn't a PHP question, but any pointers would be greatly appreciated - even if it's just a pointer to a forum that's better suited to this question. Thanks!