I was unable to find information about my case. I want to prevent the following type of URL from being indexed:

website.com/video-title/video-title/

(my website produces such doubled copies of the URLs of my video articles)

Each video article's URL starts with the word "video". So what I want to do is block all URLs of the form website.com/any-url/video-any-url. This way I will remove all the doubled copies. Could somebody help me?
This is not possible in the original robots.txt specification. But some parsers support wildcards in Disallow anyway, for example Google's. So for Google's bots, you could use the following line:

Disallow: /*/video

This should block any URL whose path starts with anything and then contains "video", for example:
/foo/video
/foo/videos
/foo/video.html
/foo/video/bar
/foo/bar/videos
/foo/bar/foo/bar/videos
Parsers that don't support wildcards would interpret the rule literally, i.e., they would only block URLs whose paths literally begin with "/*/video", such as:
/*/video
/*/videos
/*/video/foo
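To see which of your URLs the wildcard rule would catch, you can model Google-style matching yourself. This is only an illustrative sketch of the documented semantics ("*" matches any sequence of characters, and a rule matches as a path prefix), not Google's actual implementation; the helper name is made up:

```python
import re

def google_rule_to_regex(rule):
    """Translate a Google-style robots.txt path rule into a regex.

    '*' matches any run of characters; everything else is literal.
    The resulting regex is matched against the start of the path,
    mirroring robots.txt prefix matching.
    """
    pattern = ""
    for ch in rule:
        if ch == "*":
            pattern += ".*"
        else:
            pattern += re.escape(ch)
    return re.compile(pattern)

rule = google_rule_to_regex("/*/video")

paths = [
    "/foo/video",                # blocked
    "/foo/bar/videos",           # blocked
    "/video-title/video-title/", # doubled copy: blocked
    "/video-title/",             # original article: NOT blocked
]
blocked = [p for p in paths if rule.match(p)]
```

Note that the original article URLs like /video-title/ are not matched, because "/video" there sits at the very start of the path, while the rule requires at least a leading "/" before it. That is exactly the behavior you want: only the doubled copies are blocked.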