I have Rails apps that record an IP address from every request to a specific URL, but in my IP database I've found Facebook IP blocks like 66.220.15.* and Google IPs (I suspect they come from bots). Is there any way to determine whether a request was made by a robot or search engine spider? Thanks
Answer 1:
Robots are required (by common sense / courtesy more than any kind of law) to send a User-Agent header along with their request. You can check for this using request.env["HTTP_USER_AGENT"] and filter as you please.
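
A minimal sketch of that filter as a before_action (the regex below is an illustrative, non-exhaustive set of crawler tokens, not an official list):

class ApplicationController < ActionController::Base
  before_action :filter_bots

  private

  # Hypothetical filter: reject requests whose User-Agent looks like a bot.
  BOT_PATTERN = /bot|crawl|spider|slurp/i

  def filter_bots
    user_agent = request.env["HTTP_USER_AGENT"].to_s # header may be missing
    head :forbidden if user_agent.match?(BOT_PATTERN)
  end
end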
Answer 2:
Since well-behaved bots typically include a reference URI in the UA string they send, something like:
request.env["HTTP_USER_AGENT"].to_s.match(/\(.*https?:\/\/.*\)/)
is an easy way to see if the request came from a bot rather than a human user's agent (the .to_s guards against a missing header, which would otherwise raise on nil). This seems more robust than trying to match against a comprehensive list.
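
For instance, a quick check (the Googlebot UA string below is just a representative example of the parenthesized reference-URL convention):

ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
ua.match(/\(.*https?:\/\/.*\)/) # matches, so treat the request as a bot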
Answer 3:
I think you can use the browser gem to check for bots:

browser = Browser.new(request.user_agent)
if browser.bot?
  # code here
end
https://github.com/fnando/browser
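
Tying this back to the question, a hedged sketch of skipping IP recording for bot traffic (VisitorIp and record_visitor_ip are hypothetical names, not part of the gem):

class ApplicationController < ActionController::Base
  before_action :record_visitor_ip

  private

  def record_visitor_ip
    return if Browser.new(request.user_agent).bot? # skip crawlers/spiders
    VisitorIp.create(address: request.remote_ip)   # hypothetical model
  end
end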
Answer 4:
Another way is to use the crawler_detect gem:

CrawlerDetect.is_crawler?("Bot user agent")
=> true

# or, after adding the Rack::Request extension:
request.is_crawler?
=> true

It can be useful if you want to detect a wide variety of different bots (more than 1,000).
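
For example, the same hypothetical IP-recording callback from above, guarded with this gem (request.user_agent can be nil, hence the .to_s):

def record_visitor_ip
  return if CrawlerDetect.is_crawler?(request.user_agent.to_s)
  VisitorIp.create(address: request.remote_ip) # hypothetical model
end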