Google.com and clients1.google.com/generate_204

2019-01-21 06:23发布

站内文章 / JavaScript

11 0

太酷不给撩

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I was looking into google.com's Net activity in firebug just because I was curious and noticed a request was returning "204 No Content."

It turns out that a 204 No Content "is primarily intended to allow input for actions to take place without causing a change to the user agent's active document view, although any new or updated metainformation SHOULD be applied to the document currently in the user agent's active view." Whatever.

I've looked into the JS source code and saw that "generate_204" is requested like this:

(new Image).src="http://clients1.google.com/generate_204"

No variable declaration/assignment at all.

My first idea is that it was being used to track if Javascript is enabled. But the "(new Image).src='...'" call is called from a dynamically loaded external JS file anyway, so that would be pointless.

Anyone have any ideas as to what the point could be?

UPDATE

"/generate_204" appears to be available on many google services/servers (e.g., maps.google.com/generate_204, maps.gstatic.com/generate_204, etc...).

You can take advantage of this by pre-fetching the generate_204 pages for each google-owned service your web app may use. Like This:

window.onload = function(){
    var two_o_fours = [
        // google maps domain ...
        "http://maps.google.com/generate_204",

        // google maps images domains ... 
        "http://mt0.google.com/generate_204",
        "http://mt1.google.com/generate_204",
        "http://mt2.google.com/generate_204",
        "http://mt3.google.com/generate_204",

        // you can add your own 204 page for your subdomains too!
        "http://sub.domain.com/generate_204"
    ];
    for(var i = 0, l = two_o_fours.length; i < l; ++i){
        (new Image).src = two_o_fours[i];
    }
};

回答1:

Like Snukker said, clients1.google.com is where the search suggestions come from. My guess is that they make a request to force clients1.google.com into your DNS cache before you need it, so you will have less latency on the first "real" request.

Google Chrome already does that for any links on a page, and (I think) when you type an address in the location bar. This seems like a way to get all browsers to do the same thing.

回答2:

I found this old Thread while google'ing for generate_204 as Android seems to use this to determine if the wlan is open (response 204 is received) closed (no response at all) or blocked (redirect to captive portal is present). In that case a notification is shown that a log-in to WiFi is required...

回答3:

In the event that Chrome detects SSL connection timeouts, certificate errors, or other network issues that might be caused by a captive portal (a hotel's WiFi network, for instance), Chrome will make a cookieless request to http://www.gstatic.com/generate_204 and check the response code. If that request is redirected, Chrome will open the redirect target in a new tab on the assumption that it's a login page. Requests to the captive portal detection page are not logged.

Font: Google Chrome Privacy Whitepaper

回答4:

Google is using this to detect whether the device is online or in captive portal.

Shill, the connection manager for Chromium OS, attempts to detect services that are within a captive portal whenever a service transitions to the ready state. This determination of being in a captive portal or being online is done by attempting to retrieve the webpage http://clients3.google.com/generate_204. This well known URL is known to return an empty page with an HTTP status 204. If for any reason the web page is not returned, or an HTTP response other than 204 is received, then shill marks the service as being in the portal state.

Here is the relevant explanation from the Google Chrome Privacy Whitepaper:

In the event that Chrome detects SSL connection timeouts, certificate errors, or other network issues that might be caused by a captive portal (a hotel's WiFi network, for instance), Chrome will make a cookieless request to http://www.gstatic.com/generate_204 and check the response code. If that request is redirected, Chrome will open the redirect target in a new tab on the assumption that it's a login page. Requests to the captive portal detection page are not logged.

More info: http://www.chromium.org/chromium-os/chromiumos-design-docs/network-portal-detection

回答5:

204 responses are sometimes used in AJAX to track clicks and page activity. In this case, the only information being passed to the server in the get request is a cookie and not specific information in request parameters, so this doesn't seem to be the case here.

It seems that clients1.google.com is the server behind google search suggestions. When you visit http://www.google.com, the cookie is passed to http://clients1.google.com/generate_204. Perhaps this is to start up some kind of session on the server? Whatever the use, I doubt it's a very standard use.

回答6:

with the massive remit by google to stop both spam and the scraping of their search database, I believe that this is part of the effort to track bots etc.

some simple anti bot pseudo could go like this.

On GET (google.*) Save RemoteEndPoint
{
    If RemoteEndPoint GETs (clients1.google.com/generate_204) Then
        Set botAlert_stage1 = false;
    Else
        Set botAlert_stage1 = true;
    End If
}

I also believe that the latest google frontpage 'theme' is also a new tool to help with the anti spam/bot activity.

** NOTE ** ipv6.google.com also includes this measure.

Just my unfounded unproofed two 2p.

回答7:

This documents explains:

http://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1417&context=ecetr&sei-redir=1

(Search for generate204)

Relevant section:

Among the different objects, a javascript function triggers a generate204 request sent to the video server that is supposed to serve the video. This starts the video prefetch, which has two main goals: first, it forces the client to perform the DNS resolution of the video server name. Second, it forces the client to open a TCP connection toward the video server. Both help to speed-up the video download phase.

In addition, the generate204 request has exactly the same format and options of the real video download request, so that the video server is eventually warned that a client will possibly download that video very soon. Note that the video server replies with a 204 No Content response, as implied by the command, and no video content is downloaded so far.

回答8:

I found this blog post which explains that it's used to record clicks. Without official word from Google it could be used any number of things.

http://mark.koli.ch/2009/03/howto-configure-apache-to-return-a-http-204-no-content-for-ajax.html

回答9:

Many applications access this URL to determine if they have a connection that only leads to a captive portal.

The idea is that any captive portal thinks this is a "normal" website, and then redirects you to its portal site, which is returned with a status 200. If an application tries to access any normal website, it is confronted with a totally unexpected response and may have problems figuring out what's wrong. However, with this URL it's easy: If you get status 200, you are inside a captive portal, and you can tell your user to do something about it (usually either log in to the portal using a browser, or turn WiFi off and rely on 3G, if they are using a phone). If you get status 204, you got connected to Google, so your application is actually connected to the internet.

Microsoft and Apple use a slightly different approach; they both have some URLs that return a very specific short text message with a status 200, so instead of accessing the Google url you can for example go to "captive.apple.com" and check for status 200 with data = "Success" and nothing else. If you get status 200 and not exactly that data then you are again in a captive portal.

回答10:

The generate 204 might be dynamically loading the suggestions of search criteria. AS i can see from my load test script, this is seemingly responsible for every server call each time the user types into the text box

回答11:

Well i have been looking at this for a few times and resulted that Google logs referer's where they come from first time visiting the google.com for ex; tracking with Google Chrome i have a 90% guess that its for Logging Referers, maybe User-Agent statistics well known when Google release its list of standards of browser usage:

Request URL: http://clients1.google.se/generate_204
Request Method: GET
Status Code: 204 No Content

Response Headers

Content-Length: 0
Content-Type: text/html
Date: Fri, 21 May 2010 17:06:24 GMT
Server: GFE/2.0

Here "Referer" under "^Request Headers" shows Googles statistics that many folks come from Microsoft.com, also parsing out the word "Windows 7" to help me focus on Windows 7 in my up-following searches that session

//Steven