Why does Google PageSpeed ask to specify ETag eve

Posted 2019-06-13 15:30

Question:

I have set the cache headers far in the future (one year from now) and disabled ETags as advised by YSlow (http://developer.yahoo.com/performance/rules.html#etags), but Google PageSpeed still seems to require an ETag (or Last-Modified) even after the cache headers are set.

"It is important to specify one of Expires or Cache-Control max-age, and one of Last-Modified or ETag, for all cacheable resources."

The two rules seem to conflict with each other.

Answer 1:

YSlow does not advise removing ETags in general, only in some environments. If you do not use ETags, you should use Last-Modified instead.

ETag and Last-Modified are used for conditional GET requests, when the browser re-requests a resource that is already cached and may have expired.

Cache-Control max-age defines how long a cached item can be treated as valid without asking the server again. (Once it has expired by this rule, the browser will make a conditional GET ...)

So in your case:

  • The browser caches the resource for one year. Within that year no request for this resource is made at all; it is served directly from the local cache. (Uses the Cache-Control header settings.)
  • After the year has expired, the browser makes a conditional request to check whether anything has changed. If nothing has changed, the server responds with HTTP 304 and an empty body, and the browser keeps using its cached item without any retransmission. (Uses the ETag and/or Last-Modified header settings.)
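The second step can be sketched from the server's side. This is only a minimal illustration, assuming the request headers arrive as a plain dict with exactly these key spellings; the function name `revalidate` and its parameters are made up for the example and do not come from any particular framework.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def _opaque(tag):
    """Strip the weak-validator prefix so tags can be compared weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def revalidate(request_headers, etag, last_modified):
    """Return (status, response_headers) for a GET on a cacheable resource.

    `last_modified` must be a timezone-aware UTC datetime.
    A 304 means "send no body, keep using the cached copy".
    """
    response_headers = {
        "ETag": etag,
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        "Cache-Control": "max-age=31536000",  # one year, as in the question
    }

    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When If-None-Match is present it takes precedence over
        # If-Modified-Since.
        candidates = [_opaque(t) for t in if_none_match.split(",")]
        if "*" in candidates or _opaque(etag) in candidates:
            return 304, response_headers
        return 200, response_headers

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparsable date: ignore the header
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution.
            if last_modified.replace(microsecond=0) <= since:
                return 304, response_headers

    # No usable validator in the request: send the full resource.
    return 200, response_headers
```

With a matching `If-None-Match` or a recent enough `If-Modified-Since` the function returns 304; with no validator in the request it always returns 200, which is the full download.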

(The browser may or may not respect these settings. For example, a browser may make a conditional request even before the year has expired.)

For highly optimized sites, Cache-Control is far more important: you set far-future expiry headers and simply change the URL of the resource whenever it changes. While this prevents the use of conditional requests, it lets you be extremely aggressive when defining the expiry header while still being able to serve new versions of the resource immediately to everybody at the same time. Because of the new URL, the changed file looks like a new resource from the browser's point of view.

For Java, there is a framework called jawr that makes use of these and other concepts without negatively affecting your site development.
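One common way to "change the URL when the resource changes" is to put a hash of the file content into the file name. The sketch below assumes that approach; the helper name `fingerprinted_url` and the 12-character digest length are arbitrary choices for illustration, not something prescribed by YSlow or PageSpeed.

```python
import hashlib
from pathlib import PurePosixPath

# Sent with every fingerprinted resource: cache for one year.
FAR_FUTURE = "public, max-age=31536000"


def fingerprinted_url(path, content):
    """Insert a short content hash before the file extension.

    The same bytes always give the same URL; any change to the bytes
    gives a new URL, which the browser treats as a new resource.
    """
    digest = hashlib.sha256(content).hexdigest()[:12]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Calling `fingerprinted_url("/static/application.js", data)` yields a path of the form `/static/application.<hash>.js`. The HTML then references that path, so a new deployment changes the reference and the old cached copy is simply never asked for again.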



Answer 2:

ETag and Cache-Control headers are not mutually exclusive. The reason the page you linked to recommends removing ETags is to reduce the size of the HTTP headers, which will at best save you a few bytes. Here's a use case showing where and why it still makes sense to have both:

  • You serve application.js with a one-week expiry date and an ETag fingerprint.
  • A week passes and the user comes back to your site: the file has expired, so the browser dispatches a conditional request. If the file has not been modified, the server answers 304 Not Modified and the browser skips downloading the file again. (Last-Modified works too.)

If you don't provide an ETag or Last-Modified, the browser has to request and download the entire file.
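Seen from the client, the difference looks roughly like this. It is a simplified sketch, not how any real browser is implemented; `cached` stands for the response headers stored with the cached copy, and the function name is hypothetical.

```python
def revalidation_headers(cached):
    """Build the extra request headers for re-checking an expired entry.

    An empty result means there is nothing to validate against, so the
    request is an ordinary GET and the whole file is downloaded again.
    """
    headers = {}
    if cached.get("ETag"):
        headers["If-None-Match"] = cached["ETag"]
    if cached.get("Last-Modified"):
        headers["If-Modified-Since"] = cached["Last-Modified"]
    return headers
```

Only a request that carries at least one of these headers can be answered with a 304.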

Good related resource: https://developers.google.com/speed/articles/caching