Why serve 1x1 pixel GIF (web bugs) data at all?

2019-01-20 22:10发布

Many analytic and tracking tools are requesting 1x1 GIF image (web bug, invisible for the user) for cross-domain event storing/processing.

Why to serve this GIF image at all? Wouldn't it be more efficient to simply return some error code such as 503 Service Temporary Unavailable or empty file?

Update: To be more clear, I'm asking why to serve GIF image data when all information required has been already sent in request headers. The GIF image itself does not return any useful information.

7条回答
Emotional °昔
2楼-- · 2019-01-20 22:24

First, i disagree with the two previous answers--neither engages the question.

The one-pixel image solves an intrinsic problem for web-based analytics apps (like Google Analytics) when working in the HTTP Protocol--how to transfer (web metrics) data from the client to the server.

The simplest of the methods described by the Protocol, the simplest (at lest the simplest method that includes a request body) is the GET request. According to this Protocol method, clients initiate requests to servers for resources; servers process those requests and return appropriate responses.

For a web-based analytics app, like GA, this uni-directional scheme is bad news, because it doesn't appear to allow a server to retrieve data from a client on demand--again, all servers can do is supply resources not request them.

So what's the solution to the problem of getting data from the client back to the server? Within the HTTP context there are other Protocol methods other than GET (e.g., POST) but that's a limited option for many reasons (as evidenced by its infrequent and specialized use such as submitting form data).

If you look at a GET Request from a browser, you'll see it is comprised of a Request URL and Request Headers (e.g., Referer and User-Agent Headers), the latter contains information about the client--e.g., browser type and version, browser langauge, operating system, etc.

Again, this is part of the Request that the client sends to the server. So the idea that motivates the one-pixel gif is for the client to send the web metrics data to the server, wrapped inside a Request Header.

But then how to get the client to Request a resource so it can be "tricked" into sending the metrics data? And how to get the client to send the actual data the server wants?

Google Analytics is a good example: the ga.js file (the large file whose download to the client is triggered by a small script in the web page) includes a few lines of code that directs the client to request a particular resource from a particular server (the GA server) and to send certain data wrapped in the Request Header.

But since the purpose of this Request is not to actually get a resource but to send data to the server, this resource should be a small as possible and it should not be visible when rendered in the web page--hence, the 1 x 1 pixel transparent gif. The size is the smallest size possible, and the format (gif) is the smallest among the image formats.

More precisely, all GA data--every single item--is assembled and packed into the Request URL's query string (everything after the '?'). But in order for that data to go from the client (where it is created) to the GA server (where it is logged and aggregated) there must be an HTTP Request, so the ga.js (google analytics script that's downloaded, unless it's cached, by the client, as a result of a function called when the page loads) directs the client to assemble all of the analytics data--e.g., cookies, location bar, request headers, etc.--concatenate it into a single string and append it as a query string to a URL (*http://www.google-analytics.com/__utm.gif*?) and that becomes the Request URL.

It's easy to prove this using any web browser that has allows you to view the HTTP Request for the web page displayed in your browser (e.g., Safari's Web Inspector, Firefox/Chrome Firebug, etc.).

For instance, i typed in valid url to a corporate home page into my browser's location bar, which returned that home page and displayed it in my browser (i could have chosen any web site/page that uses one of the major analytics apps, GA, Omniture, Coremetrics, etc.)

The browser i used was Safari, so i clicked Develop in the menu bar then Show Web Inspector. On the top row of the Web Inspector, click Resources, find and click the utm.gif resource from the list of resources shown on the left-hand column, then click the Headers tab. That will show you something like this:

Request URL:http://www.google-analytics.com/__utm.gif?
           utmwv=1&utmn=1520570865&
           utmcs=UTF-8&
           utmsr=1280x800&
           utmsc=24-bit&
           utmul=enus&
           utmje=1&
           utmfl=10.3%20r181&

Request Method:GET
Status Code:200 OK

Request Headers
    User-Agent:Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/533.21.1 
                 (KHTML, like Gecko) Version/5.0.5 Safari/533.21.1

Response Headers
    Cache-Control:private, no-cache, no-cache=Set-Cookie, proxy-revalidate
    Content-Length:35
    Content-Type:image/gif
    Date:Wed, 06 Jul 2011 21:31:28 GMT

The key points to notice are:

  1. The Request was in fact a request for the utm.gif, as evidenced by the first line above: *Request URL:http://www.google-analytics.com/__utm.gif*.

  2. The Google Analytics parameters are clearly visible in the query string appended to the Request URL: e.g., utmsr is GA's variable name to refer to the client screen resolution, for me, shows a value of 1280x800; utmfl is the variable name for flash version, which has a value of 10.3, etc.

  3. The Response Header called Content-Type (sent by the server back to the client) also confirms that the resource requested and returned was a 1x1 pixel gif: Content-Type:image/gif

This general scheme for transferring data between a client and a server has been around forever; there could very well be a better way of doing this, but it's the only way i know of (that satisfies the constraints imposed by a hosted analytics service).

查看更多
女痞
3楼-- · 2019-01-20 22:26

You don't have to serve an image if you are using the Beacon API (https://w3c.github.io/beacon/) implementation method.

An error code would work if you have access to the log files of your server. The purpose of serving the image is to obtain more data about the user than you normally would with a log file.

查看更多
霸刀☆藐视天下
4楼-- · 2019-01-20 22:28

Because such a GIF has a known presentation in a browser - it's a single pixel, period. Anything else presents a risk of visually interfering with the actual content of the page.

HTTP errors could appear as oversized frames of error text or even as a pop-up window. Some browsers may also complain if they receive empty replies.

In addition, in-page images are one of the very few data types allowed by default in all broswers. Anything else may require explicit user action to be downloaded.

查看更多
聊天终结者
5楼-- · 2019-01-20 22:31

This is to answer the OP's question - "why to serve GIF image data..."

Some users will put a simple img tag to call your event logging service -

<img src="http://www.example.com/logger?event_id=1234">

In this case, if you don't serve an image, the browser will show a placeholder icon that will look ugly and give the impression that your service is broken!

What I do is, look for the Accept header field. When your script is called via an img tag like this, you will see something like following in the header of the request -

Accept: image/gif, image/*
Accept-Encoding:gzip,deflate
...

When there is "image/"* string in the Accept header field, I supply the image, otherwise I just reply with 204.

查看更多
Evening l夕情丶
6楼-- · 2019-01-20 22:32

Doug's answer is pretty comprehensive; I thought I'd add in an additional note (at the OP's request, off of my comment)

Doug's answer explains why 1x1 pixel beacons are used for the purpose they are used for; I thought I'd outline a potential alternative approach, which is to use HTTP Status Code 204, No Content, for a response, and not send an image body.

204 No Content

The server has fulfilled the request but does not need to return an entity-body, and might want to return updated metainformation. The response MAY include new or updated metainformation in the form of entity-headers, which if present SHOULD be associated with the requested variant.

Basically, the server receives the request, and decides to not send a body (in this case, to not send an image). But it replies with a code to inform the agent that this was a conscious decision; basically, its just a shorter way to respond affirmatively.

From Google's Page Speed documentation:

One popular way of recording page views in an asynchronous fashion is to include a JavaScript snippet at the bottom of the target page (or as an onload event handler), that notifies a logging server when a user loads the page. The most common way of doing this is to construct a request to the server for a "beacon", and encode all the data of interest as parameters in the URL for the beacon resource. To keep the HTTP response very small, a transparent 1x1-pixel image is a good candidate for a beacon request. A slightly more optimal beacon would use an HTTP 204 response ("no content") which is marginally smaller than a 1x1 GIF.

I've never tried it, but in theory it should serve the same purpose without requiring the gif itself to be transmitted, saving you 35 bytes, in the case of Google Analytics. (In the scheme of things, unless you're Google Analytics serving many trillions of hits per day, 35 bytes is really nothing.)

You can test it with this code:

var i = new Image(); 
i.src = "http://httpstat.us/204";
查看更多
孤傲高冷的网名
7楼-- · 2019-01-20 22:41

Well the major reason is to attach the cookie to it so if users go from one side to another we still have the same element to attach cookie to.

查看更多
登录 后发表回答