I'm having trouble formulating HTTP cache headers for the following situation.
Our server hosts a large data set that changes perhaps a couple of times a week. I want browsers to cache this data. Additionally, I want to minimize latency from conditional GETs, as the network is unreliable.
The end behavior I'm after is this:
1. Client requests a resource it hasn't seen before.
2. Server responds with the resource along with an ETag and max-age (24 hours).
3. Until 24 hours have passed, the client uses the cached resource.
4. After the expiration date, the client performs a validation request (If-None-Match: [etag]).
5. If the resource has not changed:
   - the server responds with 304 Not Modified
   - the client is somehow informed that the existing resource has a new expiration date 24 hours from now
   - return to step 3
Boiled down to its essence: can a 304 response contain a new max-age? Or is the original max-age honored for subsequent requests?
Yes, a 304 response can contain a new max-age (or ETag, or other response headers for that matter).
I did an experiment using Firefox 4 to test whether the original max-age or the new one is honored, and the answer was that the new max-age is honored, so you should be able to implement what you want to do.
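To make the server side concrete, here is a minimal sketch of the decision the server has to make, written as a plain Python function rather than for any particular framework. The names `respond`, `make_etag` and `MAX_AGE_SECONDS` are my own, and hashing the body with SHA-256 is just one assumed way to produce an ETag. It also only handles a simple comma-separated If-None-Match value and ignores `*` and weak validators.

```python
import hashlib
from typing import Dict, Optional, Tuple

MAX_AGE_SECONDS = 86400  # 24 hours, as in the question


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the body (an assumed scheme, not a requirement)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET of `body`."""
    etag = make_etag(body)
    # The same caching headers go on both the 200 and the 304, so a
    # successful revalidation hands the client a fresh max-age as well.
    headers = {"ETag": etag, "Cache-Control": f"max-age={MAX_AGE_SECONDS}"}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates:
            return 304, headers, b""  # unchanged: no body is sent
    return 200, headers, body
```

The point is only that the 304 branch carries the same Cache-Control header as the 200 branch, so the server never needs to track when a given client last fetched the resource.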
It's important to remember that max-age is relative to the Date response header, not Last-Modified, so whenever your server sets a max-age directive of 24 hours, it is saying "24 hours from right now." So, assuming that's what you want, you won't have to change your max-age at all, just always return 86400.
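As a rough illustration of "relative to Date", the check a cache performs can be sketched like this. This is a simplified version that I wrote for illustration: the function name `is_fresh` is hypothetical, and it ignores the Age header and the transit-delay corrections that the full age calculation in the RFC includes.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(date_header: str, max_age: int, now: datetime) -> bool:
    """Simplified freshness check: the response's age is measured from Date."""
    response_time = parsedate_to_datetime(date_header)
    age_seconds = (now - response_time).total_seconds()
    # Last-Modified plays no part here; only Date and max-age matter.
    return age_seconds < max_age


# Values taken from the first response in the experiment described here.
date_header = "Tue, 14 Jun 2011 23:48:51 GMT"
one_minute_later = datetime(2011, 6, 14, 23, 49, 51, tzinfo=timezone.utc)
at_revalidation = datetime(2011, 6, 14, 23, 50, 54, tzinfo=timezone.utc)

print(is_fresh(date_header, 120, one_minute_later))  # 60 s old
print(is_fresh(date_header, 120, at_revalidation))   # 123 s old
```

With max-age=120, the first call reports the response as fresh and the second as stale, which is why the conditional GET goes out at that point.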
Anyway, here's an overview and dump of my experiment. Basically, I hit a test URL that set an ETag and set max-age to 120 seconds. Accordingly, the server returned the page with these response headers:
HTTP/1.1 200 OK
Date: Tue, 14 Jun 2011 23:48:51 GMT
Cache-Control: max-age=120
Etag: "901ea3d0ac9303ae4855a09676f96701"
Last-Modified: Mon, 13 Jun 2011 22:20:03 GMT
I then repeatedly hit Enter in the address bar to reload the page (without forcing a hard reload). There was no network traffic, since Firefox kept loading the page from cache. Then, once the 120 seconds were over, the very next time I hit Enter, Firefox instead sent a conditional GET to the server, as you would expect. The request and response from the server were:
GET /example HTTP/1.1
If-Modified-Since: Mon, 13 Jun 2011 22:20:03 GMT
If-None-Match: "901ea3d0ac9303ae4855a09676f96701"
HTTP/1.1 304 Not Modified
Date: Tue, 14 Jun 2011 23:50:54 GMT
Etag: "901ea3d0ac9303ae4855a09676f96701"
Cache-Control: max-age=240
Note that in the 304 response, I've had the server change max-age from 120 seconds to 240.
So, the big question is, what would happen after 120 seconds? Would Firefox respect the new max-age and continue loading the page from cache, or would it hit the server? The answer is that it continued loading the page from cache, and did not re-request until after 240 seconds were reached:
GET /example HTTP/1.1
If-Modified-Since: Mon, 13 Jun 2011 22:20:03 GMT
If-None-Match: "901ea3d0ac9303ae4855a09676f96701"
HTTP/1.1 304 Not Modified
Date: Tue, 14 Jun 2011 23:54:56 GMT
Etag: "901ea3d0ac9303ae4855a09676f96701"
Cache-Control: max-age=240
I repeated through another 240-second cycle and things worked as you'd expect. So, hopefully that answers the question for you.
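If it helps to see the observed behavior as logic, here is a toy model of a cache entry that takes the new max-age from a 304 and restarts its freshness clock. This is my own sketch of what the dumps suggest, not Firefox's actual implementation. `CacheEntry` and `apply_304` are hypothetical names, and time is plain seconds on an imaginary clock.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    body: bytes
    etag: str
    max_age: int      # seconds, from Cache-Control
    stored_at: float  # clock reading when the entry was last validated

    def is_fresh(self, now: float) -> bool:
        return now - self.stored_at < self.max_age

    def apply_304(self, new_max_age: int, now: float) -> None:
        """Keep the cached body, adopt the 304's max-age, restart the clock."""
        self.max_age = new_max_age
        self.stored_at = now


# Timeline mirroring the dumps: 200 at t=0, first 304 at t=123 (23:50:54).
entry = CacheEntry(b"page", '"901ea3d0ac9303ae4855a09676f96701"', 120, 0.0)
assert not entry.is_fresh(123)   # stale, so a conditional GET is sent
entry.apply_304(240, now=123)
assert entry.is_fresh(123 + 120)      # the old 120 s limit no longer applies
assert not entry.is_fresh(123 + 242)  # stale again at 23:54:56
```

The body and ETag are untouched by the 304; only the freshness information changes, which matches what the second and third requests in the dumps show.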
The HTTP caching RFC (RFC 2616 at the time of writing, since superseded by RFC 7234 and then RFC 9111) explains how age computations are supposed to be implemented, and how the other Cache-Control parameters work. There's no guarantee that every browser and proxy will follow the rules, but at this point HTTP/1.1 is pretty old and you'd expect most of them to do as Firefox does.
(Note: For brevity in these example dumps, I've deleted irrelevant headers such as host, connection/keep-alive, content encoding/length/type, user-agent etc.)