How to tell if a web request is coming from Google

#2 · 2019-02-24 16:13

You can read the official Verifying Googlebot page.

Quoting the page here:

You can verify that a bot accessing your server really is Googlebot (or another Google user-agent) by using a reverse DNS lookup, verifying that the name is in the googlebot.com domain, and then doing a forward DNS lookup using that googlebot name. This is useful if you're concerned that spammers or other troublemakers are accessing your site while claiming to be Googlebot.

For example:
> host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.

> host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1

Google doesn't post a public list of IP addresses for webmasters to whitelist. This is because these IP address ranges can change, causing problems for any webmasters who have hard coded them. The best way to identify accesses by Googlebot is to use the user-agent (Googlebot).
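
The reverse-then-forward lookup from the quoted page can also be done in code. Below is a minimal Python sketch, not an official implementation: the function name `is_verified_googlebot` and its two optional lookup parameters are made up for illustration (they let the logic run without real DNS), and only the googlebot.com domain named in the quote is accepted.

```python
import socket

# Suffix the reverse-DNS name must end with, per the quoted page.
GOOGLEBOT_SUFFIX = ".googlebot.com"


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` passes the reverse-then-forward DNS check.

    reverse_lookup(ip) -> hostname and forward_lookup(hostname) -> list of
    addresses are hypothetical hooks; by default they use real DNS.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex resolves IPv4 only; IPv6 would need getaddrinfo.
        forward_lookup = lambda name: socket.gethostbyname_ex(name)[2]
    try:
        # Step 1: reverse lookup, then check the domain of the returned name.
        hostname = reverse_lookup(ip).rstrip(".").lower()
        if not hostname.endswith(GOOGLEBOT_SUFFIX):
            return False
        # Step 2: forward lookup must map the name back to the same address.
        return ip in forward_lookup(hostname)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses.
        return False
```

Calling `is_verified_googlebot("66.249.66.1")` with the defaults performs the same two lookups as the `host` commands above, so the result depends on live DNS. Since each call costs two DNS queries, caching the result per IP is probably worthwhile on a busy server.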


一纸荒年 Trace。

#3 · 2019-02-24 16:13

If you're using the Apache web server, have a look at the log file 'log\access.log'.

Then load Google's IPs from http://www.iplists.com/nw/google.txt and check whether any of them appears in your log.
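
A rough Python sketch of that log check follows. It rests on two assumptions that are not confirmed here: that the downloaded list holds one address per line (blank lines and `#` comments are skipped), and that each access-log line starts with the client address, as in Apache's common log format. The function names are hypothetical. Keep in mind that a static list can go stale, since Google's address ranges change.

```python
def load_ip_set(lines):
    """Collect addresses from text lines, skipping blanks and # comments."""
    ips = set()
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            ips.add(line)
    return ips


def matching_log_lines(log_lines, known_ips):
    """Yield log lines whose first field (the client address) is in known_ips."""
    for line in log_lines:
        fields = line.split(None, 1)  # first whitespace-separated field
        if fields and fields[0] in known_ips:
            yield line.rstrip("\n")
```

Both functions accept any iterable of lines, so open file objects for the saved list and for access.log can be passed in directly.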


老娘就宠你

#4 · 2019-02-24 16:15

I captured a Google crawler request in my ASP.NET application; here is what its signature looks like.

Requesting IP: 66.249.71.113
Client: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

My logs show many different IPs for the Google crawler in the 66.249.71.* range. All of these IPs geolocate to Mountain View, CA, USA.

A simple way to check whether a request comes from the Google crawler is to verify that the User-Agent contains Googlebot and http://www.google.com/bot.html. Because many different IPs send the same client string, I would not recommend checking IPs. That is where client identity comes into the picture, so verify the client identity instead.

Here's some sample code in C#.

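    // Request.UserAgent is null when the client sends no User-Agent header.
    // StringComparison lives in the System namespace.
    string userAgent = Request.UserAgent ?? string.Empty;

    if (userAgent.IndexOf("googlebot", StringComparison.OrdinalIgnoreCase) >= 0 ||
        userAgent.IndexOf("google.com/bot.html", StringComparison.OrdinalIgnoreCase) >= 0)
    {
        // The client claims to be Googlebot.
    }
    else
    {
        // The client is something else.
    }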

It's important to note that any HTTP client can easily fake this.
