I've created a new subdomain for all static assets (static.example.com): I added a new A record pointing it at a second IP address on the same server, then created a virtual host with the same DocumentRoot as the main www.example.com site. We've pointed all references to static resources at the static subdomain; however, every resource on the site can still be reached via either static.example.com or www.example.com.
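For context, the virtual host setup looks roughly like this (a sketch only; the IP addresses and DocumentRoot path below are placeholders, not our real values):

# Sketch of the two vhosts; IPs and paths are placeholders
<VirtualHost 203.0.113.10:80>
    ServerName www.example.com
    DocumentRoot /var/www/example
</VirtualHost>

<VirtualHost 203.0.113.11:80>
    ServerName static.example.com
    DocumentRoot /var/www/example   # same DocumentRoot as www
</VirtualHost>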
The problem is that Google has begun to index HTML files on the static.example.com subdomain. What would be the best way to prevent Google from indexing files on this subdomain?
There are several ways to do this. One is to use robots.txt.
Create a static.example.com.robots.txt
file in the root directory and put the following in it (you can't use robots.txt itself, because that file is shared with the other domain).
This will disallow all spiders, including Googlebot:
User-agent: *
Disallow: /
To ensure that this file is served only on the static.example.com
host, add the following rules to the .htaccess file in the root folder of your site:
RewriteEngine On
RewriteBase /

# If the request is for static.example.com...
RewriteCond %{HTTP_HOST} ^static\.example\.com$ [NC]
# ...serve the response from static.example.com.robots.txt instead
# (the file name must match the lowercase host for the lookup to succeed)
RewriteRule ^(robots\.txt)$ %{HTTP_HOST}.$1 [L,NC]
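You can verify the rule by fetching robots.txt from each host and confirming you get different content (curl shown here as one way to test):

curl http://static.example.com/robots.txt   # should show the Disallow: / file
curl http://www.example.com/robots.txt      # should show your normal robots.txt, or a 404

Note that robots.txt only stops crawling; pages Google has already indexed may linger in results for a while. If you want them dropped, another of the "several ways" is to send a noindex header only on the static host, for example (a sketch, assuming mod_setenvif and mod_headers are enabled):

# Flag requests for the static host, then mark their responses noindex
SetEnvIfNoCase Host ^static\.example\.com$ static_host
Header set X-Robots-Tag "noindex" env=static_host

Bear in mind the two approaches conflict: if robots.txt blocks crawling, Googlebot never fetches the pages and never sees the header, so pick one.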