I'd like to know whether it is possible to disallow the whole site for crawlers and allow only specific web pages or sections. Is "Allow" supported by crawlers like FAST and Ultraseek?
There is an Allow directive; however, there's no guarantee that a particular bot will support it (much like there's no guarantee a bot will even check your robots.txt to begin with). You could probably tell by examining your web logs whether specific bots were indexing only the parts of your website that you allow.
The format for allowing just a particular page or section of your website might look like the following sketch (using /public/section1 as an example path):
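    # Applies to all crawlers
    User-agent: *
    # Allow is listed first for older parsers that stop at the first match;
    # Google instead applies the most specific (longest) matching rule
    Allow: /public/section1
    # Block everything else
    Disallow: /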
This (should) prevent bots from crawling or indexing anything except for content under /public/section1. Note that the original 1994 robots.txt specification did not define Allow at all, which is why support varies: crawlers that do honor it differ in how they resolve conflicting rules, so placing the Allow line before the Disallow line is the safer ordering.