When I develop a website, I tend to go subdomain crazy. If the site is crazy.net, I'll probably configure www.crazy.net, admin.crazy.net, static.crazy.net, youare.crazy.net, etc. Some of these exist so I can stay logged in concurrently as different users: as myself, as an admin account, and maybe as a particular live user. Others keep static/media resources cookie-free and let the browser open extra parallel connections for HTML and media content.
In my typical use case, all of these subdomains point to the same Apache instance, so the same relative URL returns the same page on every subdomain. But I certainly don't want Google indexing all those subdomains; I want a single canonical domain for the site. I'm hardly an expert in Apache configuration, so it took me an hour to track down the solution. The canonical domain, www.crazy.net, should serve the usual permissive robots.txt:
# robots.txt @ http://www.crazy.net/robots.txt
User-agent: *
Disallow:
Sitemap: http://www.crazy.net/sitemap.txt
Every other subdomain, however, should return a different, restrictive robots.txt at that same relative URL:
# norobots.txt @ http://admin.crazy.net/robots.txt, http://static.crazy.net/robots.txt, etc.
User-agent: *
Disallow: /
Here is a sketch of my /etc/apache2/sites-available/default config that makes this happen. The /var/www/crazy.net paths and the NameVirtualHost line are illustrative; the shape is the point.
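# Everything at the global scope is inherited by every VirtualHost
# below unless that VirtualHost overrides it.
# (Paths are illustrative; norobots.txt is the restrictive file above.)
DocumentRoot /var/www/crazy.net
Alias /robots.txt /var/www/crazy.net/norobots.txt

# Debian/Ubuntu usually sets this in ports.conf already.
NameVirtualHost *:80

# The canonical domain overrides the global alias with the real,
# permissive robots.txt. It is listed first, so it also catches
# requests whose Host header matches no ServerName/ServerAlias.
<VirtualHost *:80>
    ServerName www.crazy.net
    Alias /robots.txt /var/www/crazy.net/robots.txt
</VirtualHost>

# Every other subdomain falls through to this catch-all and keeps
# the restrictive robots.txt alias inherited from the global scope.
<VirtualHost *:80>
    ServerName crazy.net
    ServerAlias *.crazy.net
</VirtualHost>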
Prior to this, all my directives were in a single VirtualHost. The key revelation on my part was that they don't need to be. Rather, Apache configs support inheritance from a global scope. So my VirtualHosts end up just being what's different between www.crazy.net and any other ServerName.
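A quick way to sanity-check the behavior, assuming DNS for all the subdomains already points at this server:

# Canonical domain: should print the permissive file, Sitemap line included.
curl http://www.crazy.net/robots.txt

# Any other subdomain: should print "Disallow: /" from norobots.txt.
curl http://admin.crazy.net/robots.txt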