How do I disallow a specific page in robots.txt?

Posted 2020-02-01 06:51

I am creating two pages on my site that are very similar but serve different purposes. One is to thank users for leaving a comment and the other is to encourage users to subscribe.

I don't want the duplicate content but I do want the pages to be available. Can I set the sitemap to hide one? Would I do this in the robots.txt file?

The disallow looks like this:

Disallow: /wp-admin

How would I customize it for a specific page like:

http://sweatingthebigstuff.com/thank-you-for-commenting

Tags: robots.txt
4 answers
劳资没心,怎么记你
#2 · 2020-02-01 07:02

This is simple: for any page you want to disallow, give the path of the file or folder relative to the site root. Put this in your robots.txt file, below a User-agent line:

Disallow: /thank-you-for-commenting
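
One way to sanity-check a rule like this before deploying it is Python's standard-library `urllib.robotparser`. The following is a minimal sketch, not a definitive test: the `User-agent: *` line, the helper name `is_allowed`, and the `/subscribe` path are assumptions added for illustration, and real crawlers may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Assumed file contents: the rule from the answer under a catch-all group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /thank-you-for-commenting
"""


def is_allowed(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "http://sweatingthebigstuff.com"
    # "/subscribe" is a hypothetical path for the second page.
    for path in ("/thank-you-for-commenting", "/subscribe"):
        print(path, "allowed" if is_allowed(base + path) else "blocked")
```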
chillily
#3 · 2020-02-01 07:11

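robots.txt rules are matched as path prefixes rather than full regular expressions, although major crawlers such as Googlebot and Bingbot also honor the wildcards * and $. To avoid targeting more pages than you intend, you may need to add a $ to the end of the page name: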

Disallow: /thank-you-for-commenting$

If you don't, you'll also disallow pages such as /thank-you-for-commenting-on-this-too
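
The difference can be illustrated with a small matcher. This is a sketch under the assumption that the crawler follows the wildcard semantics major search engines document (`*` matches any run of characters, a trailing `$` anchors the end of the path); `rule_matches` is a hypothetical helper, not part of any library, and Python's built-in `urllib.robotparser` does not implement `$`.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern against a URL path.

    Assumed semantics: prefix match by default, '*' matches any
    sequence of characters, a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match anchors at the start, which gives prefix matching.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    longer = "/thank-you-for-commenting-on-this-too"
    print(rule_matches("/thank-you-for-commenting", longer))
    print(rule_matches("/thank-you-for-commenting$", longer))
```

Without the `$`, the rule is a prefix and also catches the longer path; with it, only the exact path matches.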

老娘就宠你
#4 · 2020-02-01 07:20

Add this line to robots.txt:

Disallow: /thank-you-for-commenting

Take a look at the last.fm robots.txt file for inspiration.

我想做一个坏孩纸
#5 · 2020-02-01 07:24

You can also disallow a specific page, including its file extension, in robots.txt. While testing, for instance, you can list the test page's path to keep robots from crawling it.

For example:

 Disallow: /index_test.php
 Disallow: /products/test_product.html
 Disallow: /products/     

The first rule, Disallow: /index_test.php, keeps bots from crawling the test page in the root folder.

The second, Disallow: /products/test_product.html, blocks test_product.html in the 'products' folder.

The last, Disallow: /products/, blocks the whole folder from being crawled.
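
To see which paths these three rules cover, they can be fed to Python's standard-library `urllib.robotparser`. This is a rough sketch: the `User-agent: *` line, the helper name `blocked_paths`, and the extra sample paths (`/index.php`, `/products/other.html`, `/products`) are assumptions added for illustration.

```python
from urllib.robotparser import RobotFileParser

# The three example rules, placed under an assumed catch-all group.
RULES = """\
User-agent: *
Disallow: /index_test.php
Disallow: /products/test_product.html
Disallow: /products/
"""


def blocked_paths(paths, robots_txt=RULES):
    """Return the subset of `paths` that the rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch("*", p)]


if __name__ == "__main__":
    sample = [
        "/index_test.php",
        "/index.php",
        "/products/test_product.html",
        "/products/other.html",
        "/products",
    ]
    print(blocked_paths(sample))
```

Because Disallow: /products/ already covers everything inside the folder, the second rule is redundant when both are present. The path /products without a trailing slash is not covered by the folder rule.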
