I have www.domainname.com and origin.domainname.com pointing to the same codebase. Is there a way I can prevent all URLs under origin.domainname.com from getting indexed?
Is there some rule in robots.txt to do it? Both URLs point to the same folder. Also, I tried redirecting origin.domainname.com to www.domainname.com in the .htaccess file, but it doesn't seem to work.
If anyone has had a similar problem and can help, I would be grateful.
Thanks
You can rewrite requests for `robots.txt` to another file (let's name it `robots_no.txt`) whose rules disallow all crawling (source: http://www.robotstxt.org/robotstxt.html).
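The original listing for this file is not shown, so the following is only a minimal sketch of what `robots_no.txt` could contain, assuming the goal is to tell every compliant crawler to stay away from the whole host:

```
# robots_no.txt: served in place of robots.txt on hosts that should not be indexed
User-agent: *
Disallow: /
```

Note that this only asks crawlers not to fetch pages; it does not remove URLs that are already indexed.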
The .htaccess file would look like this:
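A sketch of such a rule set, assuming mod_rewrite is enabled, the `.htaccess` file sits in the shared document root, and `www.example.com` is the only host that should keep the real `robots.txt` (the host name is a placeholder):

```apache
RewriteEngine On

# For every host other than www.example.com ...
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
# ... answer requests for robots.txt with the blocking file instead
RewriteRule ^robots\.txt$ robots_no.txt [L]
```

Requests to www.example.com fail the condition and are served the normal `robots.txt`; requests to origin.example.com (or any other alias) get `robots_no.txt`.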
Alternatively, use a customized robots.txt for each (sub)domain:
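One way this could look, as a sketch under assumptions: the per-host files live in a hypothetical `robots/` directory named after the host (for example `robots/www.example.com.txt`), and any host without its own file falls back to `robots_no.txt`. The directory layout and host list are illustrative, not taken from the original answer.

```apache
RewriteEngine On

# Known hosts that may have their own robots file (placeholder names)
RewriteCond %{HTTP_HOST} ^(www|origin)\.example\.com$ [NC]
# Only rewrite if robots/<host>.txt actually exists on disk
RewriteCond %{DOCUMENT_ROOT}/robots/%{HTTP_HOST}.txt -f
RewriteRule ^robots\.txt$ robots/%{HTTP_HOST}.txt [L]

# Any other host, or a host without its own file, gets the blocking file
RewriteRule ^robots\.txt$ robots_no.txt [L]
```

Restricting the first rule to a fixed list of hosts keeps an arbitrary `Host` header from being used to build a file path.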
Instead of asking search engines not to index pages on hosts other than www.example.com, you can use `<link rel="canonical">` too. If http://example.com/page.html and http://example.org/~example/page.html both serve the same content as http://www.example.com/page.html, put this tag in the `<head>`: `<link rel="canonical" href="http://www.example.com/page.html">`.
See also Google's article about rel="canonical".