Can I 'noindex, follow' a specific page using X-Robots-Tag in .htaccess?
I've found some instructions for noindexing types of files, but I can't find instructions for noindexing a single page, and what I have tried so far hasn't worked.
This is the page I'm looking to noindex:
http://www.examplesite.com.au/index.php?route=news/headlines
This is what I have tried so far:
<FilesMatch "/index.php?route=news/headlines$">
Header set X-Robots-Tag "noindex, follow"
</FilesMatch>
Thanks for your time.
It seems to be impossible to match the request parameters with the section containers (such as <Files> or <FilesMatch>) available in a .htaccess file. Here is a list of what they can match against: http://httpd.apache.org/docs/2.2/sections.html
It will be much easier to do it in your script. If you are running PHP, try:
header('X-Robots-Tag: noindex, follow');
You can easily build conditions on $_GET, $_SERVER['REQUEST_URI'] and so on.
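As a minimal sketch of such a condition, assuming the page is served by index.php and identified only by the route parameter from the question: the helper name isNoindexRoute is made up for illustration, and the check has to run before the script sends any output.

```php
<?php
// Hypothetical helper (not part of any framework): returns true only
// when the request's "route" parameter is the news headlines page.
function isNoindexRoute(array $query): bool
{
    return isset($query['route']) && $query['route'] === 'news/headlines';
}

// Headers can only be added before any output has been sent.
if (isNoindexRoute($_GET) && !headers_sent()) {
    header('X-Robots-Tag: noindex, follow');
}
```

Unlike an exact match on the whole query string, this check still fires when extra parameters are appended to the URL, which may or may not be what you want.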
RewriteEngine on
RewriteBase /
# Set an environment variable if the URL matches
RewriteCond %{QUERY_STRING} ^route=news/headlines$
RewriteRule ^index\.php$ - [env=NOINDEXFOLLOW:true]
# Only send the header if the environment variable is set
Header set X-Robots-Tag "noindex, follow" env=NOINDEXFOLLOW
FilesMatch works on (local) file names, not URLs, so it would only ever try to match the index.php part of the URL. <Location> would be more appropriate, but it cannot be used in a .htaccess file and, as far as I can tell from the documentation, it does not match query strings either. So I ended up with the solution above (I really liked this challenge). PHP would be the more obvious place to put this, but that is up to you.
The solution requires mod_rewrite, and mod_headers of course.
Note that you'll need the mod_headers module enabled to set the headers.
Though, like others have said, it seems better to send the header from PHP. Does that not work?
According to Google the syntax would be a little different:
<Files ~ "\.pdf$">
Header set X-Robots-Tag "noindex, nofollow"
</Files>
https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag