Here is the content of my robots.txt file:
User-agent: *
Disallow: /images/
Disallow: /upload/
Disallow: /admin/
As you can see, I explicitly disallowed all robots from indexing the folders images, upload and admin. The problem is that one of my clients sent a request to remove content from the images folder, because a .pdf document from that folder appeared in the Google search results. Can anyone explain what I'm doing wrong here, and why Google indexed my folders?
Thx!
As the Google Webmaster Docs explain, a robots.txt Disallow only stops compliant crawlers from fetching those URLs; it does not keep a URL out of the index. If other pages link to your .pdf, Google can still show that URL in its search results. To actually keep the files out of the index:
Set an X-Robots-Tag header with noindex for all files in those folders; set this header in your web server config. Keep in mind that Google only sees this header if it is allowed to crawl the files, so the Disallow rule for those folders has to be lifted for the noindex to take effect. https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag?hl=de
Set the header from the Apache config for .pdf files:
<Files ~ "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>
Disable directory indexing / listing of these folders, for example as in the sketch below.
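Assuming Apache and a placeholder path under the document root, a minimal sketch of switching off listing for one folder would be:
<Directory "/var/www/html/images">
  Options -Indexes
</Directory>
Repeat for /upload/ and /admin/, or set Options -Indexes once at the vhost level if no folder should ever be listable.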
Add an empty index.html with a "noindex" robots meta tag.
<meta name="robots" content="noindex, nofollow" />
<meta name="googlebot" content="noindex" />
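A minimal placeholder file could look like this (purely a sketch; the exact title and contents do not matter):
<!DOCTYPE html>
<html>
<head>
  <meta name="robots" content="noindex, nofollow" />
  <meta name="googlebot" content="noindex" />
  <title>Nothing to index here</title>
</head>
<body></body>
</html>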
Force removal of the pages that are already indexed by using the URL removal tool in Google Webmaster Tools.
Question from the comments: how do I forbid all files in the folder?
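One way, assuming Apache with mod_headers enabled and a placeholder path, is to set the header for the whole directory instead of matching a file extension; this is a sketch for the vhost config, not a drop-in setup:
<Directory "/var/www/html/upload">
  Header set X-Robots-Tag "noindex, nofollow"
</Directory>
Every file served from that folder then carries the noindex header; again, Google has to be allowed to crawl the files for the header to be honored.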