Stop browser to make HTTP requests for images that

Posted 2019-01-12 23:51

After reading many articles and some questions on here, I finally succeeded in activating the Apache mod_expires module to tell the browser it MUST cache images for 1 year.

<filesMatch "\.(ico|gif|jpg|png)$">
  ExpiresActive On
  ExpiresDefault "access plus 1 year"
  Header append Cache-Control "public"
</filesMatch>

And thankfully server responses seem to be correct:

HTTP/1.1 200 OK 
Date: Fri, 06 Apr 2012 19:25:30 GMT 
Server: Apache 
Last-Modified: Tue, 26 Jul 2011 18:50:14 GMT 
Accept-Ranges: bytes 
Content-Length: 24884 
Cache-Control: max-age=31536000, public 
Expires: Sat, 06 Apr 2013 19:25:30 GMT
Connection: close
Content-Type: image/jpeg 
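As a sanity check on these headers, here is a minimal Python sketch of how a cache could derive the freshness lifetime from them, following the general rule in RFC 7234 that max-age takes precedence over Expires. The function name `freshness_lifetime` and the `sample` dictionary are illustrative assumptions, not part of any browser or of Apache.

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return the freshness lifetime in seconds, or None if none is given.

    Illustrative helper (hypothetical name): max-age wins over Expires.
    """
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    # Fall back to Expires minus Date when no max-age directive is present.
    if "Expires" in headers and "Date" in headers:
        delta = (parsedate_to_datetime(headers["Expires"])
                 - parsedate_to_datetime(headers["Date"]))
        return int(delta.total_seconds())
    return None


# Header values copied from the response shown above.
sample = {
    "Date": "Fri, 06 Apr 2012 19:25:30 GMT",
    "Cache-Control": "max-age=31536000, public",
    "Expires": "Sat, 06 Apr 2013 19:25:30 GMT",
}
print(freshness_lifetime(sample))  # 31536000 seconds, i.e. 365 days
```

Both headers agree here: the Expires date is exactly 365 days after the Date header, which matches max-age=31536000.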

Well, I thought this would stop the browser from downloading the images, or even asking the server about them, for 1 year. But that is only partially true: if you close and reopen the browser, it does NOT download the images from the server anymore, but it still sends the server an HTTP request for each image.

How do I force the browser to stop making HTTP requests for each image? Even if these HTTP requests are not followed by an image download, they are still requests made to the server that unnecessarily increase latency and slow down page rendering!

I already told the browser it MUST keep the images in cache for 1 year! Why does the browser still query the server for each image (even if it does not download the image)?!


Looking at the network graphs in FireBug (menu FireBug > Net > Images) I can see different caching behaviours (I obviously started with the browser cache completely empty, having forced a cache delete using "Clear All History"):

  • When the page is loaded for the 1st time, all images are downloaded (and the same thing happens if I force a page reload by clicking the browser's reload button). This makes sense!

  • When I navigate the site and come back to the same page, the images are not downloaded at all and the browser does NOT even query the server for any of them. This makes sense (and I would like to see this behaviour after the browser has been closed, too)!

  • When I close the browser and open it again on the same page, the silly browser still makes one HTTP request to the server per image: it does NOT download the image, but it still makes an HTTP request, as if the browser were asking the server about the image (the server replies with 200 OK). This is the one that irritates me!

I also attach the graphs below if you are interested:

[Two FireBug Net panel screenshots were attached here.]

EDIT: I just tested with Firefox 11.0 as well, to make sure it wasn't an issue of my Firefox 3.6 being too old. The same thing happens!!! I also tested the Google and Stack Overflow sites: they both send Cache-Control: max-age=..., but the browser still makes an HTTP request to the server for each image once it is closed and reopened on the same page. After the server responds, the browser does NOT download the image (as I explained above), but it still makes the damn request, which increases the time it takes to see the page.

EDIT2: removing the Last-Modified header, as suggested here, does not solve the problem; it makes no difference.

10 answers
Luminary・发光体
#2 · 2019-01-13 00:24

This question has a better answer here on the Webmasters Stack Exchange site.

More information, also cited in the answer linked above, is available on httpwatch.

According to the article:

There are a number of situations in which Internet Explorer needs to check whether a cached entry is valid:

  • The cached entry has no expiration date and the content is being accessed for the first time in a browser session
  • The cached entry has an expiration date but it has expired
  • The user has requested a page update by clicking the Refresh button or pressing F5
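The three conditions quoted above can be restated as a small decision function. This is only a sketch of the rule as the article describes it, not Internet Explorer's actual implementation; the function name `needs_validation` and its parameters are hypothetical.

```python
from datetime import datetime, timezone


def needs_validation(expires_at, now, first_access_in_session, user_refreshed):
    """Return True if the cached entry must be checked with the server.

    expires_at is None when the cached entry carries no expiration date.
    """
    if user_refreshed:
        # Refresh button or F5: always re-check.
        return True
    if expires_at is None:
        # No expiration date: re-check on first use in a browser session.
        return first_access_in_session
    # Expiration date present: re-check only once it has passed.
    return now >= expires_at


# The situation from the question: a new browser session, no refresh,
# and an entry that expires a year from the first download.
now = datetime(2012, 4, 7, tzinfo=timezone.utc)
expires = datetime(2013, 4, 6, 19, 25, 30, tzinfo=timezone.utc)
print(needs_validation(expires, now, first_access_in_session=True,
                       user_refreshed=False))  # prints False
```

Under these rules, an image with a far-future expiration date should not trigger a validation request just because the browser was restarted.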


时光不老,我们不散
#3 · 2019-01-13 00:25

What you are describing here does not reflect my experience. If content is served with a no-store directive, or you do an explicit refresh, then yes, I'd expect the browser to go back to the origin server. Otherwise the content should stay cached across browser restarts (assuming the browser is allowed to write a cache file, and can).

Looking at your waterfalls in a bit more detail (which is tricky because they are a bit small and blurry), the browser appears to be doing exactly what it should: it has entries for the images, but these are loaded from the local cache, not from the origin server. Check the 'Date' header in the response (why do you think it's taking milliseconds instead of seconds?). That's why they are coloured differently.

劳资没心,怎么记你
#4 · 2019-01-13 00:31

After spending considerable time looking for a reasonable answer myself, I found the link below the most useful; it does answer the question asked here.

https://webmasters.stackexchange.com/questions/25342/headers-to-prevent-304-if-modified-since-head-requests

小情绪 Triste *
#5 · 2019-01-13 00:34

The behavior you are seeing is the intended, specified behavior (see RFC 7234 for more details):

All modern browsers will send HTTP requests to the server for every page element displayed, regardless of cache status. This was a design decision made at the request of web services (especially advertising networks) to ensure that HTTP servers were able to maintain records of every display of every element.

If the browsers did not make these requests, the server would never be notified that an image had been displayed to the user. For advertising networks, this would be catastrophic. Early on, advertising networks 'hacked' their way around this by serving the same ad image using randomly generated names (ex: 'coke_ad_1_98719283719283.gif'). However, for ISPs this practice caused a huge increase in data transfers, because every one of their users was re-downloading these identical ad images, bypassing any caching/proxy servers their ISP was operating.

So a truce was reached: Browsers would always send HTTP requests, even for un-expired cached elements. Servers would respond with HTTP 304 status codes ("not modified"). This allows the servers to record the fact that the image was displayed to the client. As a result, advertising networks generally stopped using randomized image names to bypass network cache servers.

This gave the ad networks what they wanted - a record of every image displayed - and it gave ISPs what they wanted - cache-able images and static content.

That is why there isn't much you can do to prevent browsers from sending HTTP requests for cached page elements.
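For reference, a 304 is what a server returns to a conditional request whose validator still matches. Below is a minimal Python sketch of that server-side decision using If-Modified-Since; the function name `respond` is hypothetical, and real servers such as Apache also handle ETags and other preconditions that are omitted here.

```python
from email.utils import parsedate_to_datetime


def respond(last_modified, if_modified_since=None):
    """Return the status code for a GET on a resource.

    last_modified: the resource's Last-Modified value (HTTP date string).
    if_modified_since: the request's If-Modified-Since value, if any.
    """
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # an unparseable date is ignored
        if since is not None and parsedate_to_datetime(last_modified) <= since:
            return 304  # Not Modified: headers only, no body is sent
    return 200  # full response with the image body


# Last-Modified value taken from the response in the question.
stamp = "Tue, 26 Jul 2011 18:50:14 GMT"
print(respond(stamp, if_modified_since=stamp))
print(respond(stamp))
```

The first call corresponds to a revalidation of an unchanged image, the second to an unconditional request such as the very first download.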

However, other client-side mechanisms that arrived with HTML5 offer some scope for avoiding resource loading:

  1. Cache Manifest (in spite of its gotchas)
  2. IndexedDB (nice asynchronous features, allows blob storage)
  3. Local Storage (not async)
ら.Afraid
#6 · 2019-01-13 00:38

You were using the wrong tool for analysing the requests.

I'd recommend the really useful Firefox add-on Live HTTP Headers, so you can see what is really happening on the network.

And just to be sure, you can connect to your server over SSH (for example with PuTTY) and run something like

tail -f /var/log/apache2/access.log
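If the log is busy, it may be easier to count image hits than to watch them scroll by. The following Python sketch assumes the log uses Apache's common or combined format; the names `image_hits`, `LINE` and `IMAGE` and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the quoted request line and the status code that follows it.
LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')
# Same extensions as the filesMatch pattern in the question.
IMAGE = re.compile(r'\.(ico|gif|jpg|png)$', re.IGNORECASE)


def image_hits(lines):
    """Count requests per (path, status) for image files only."""
    hits = Counter()
    for line in lines:
        match = LINE.search(line)
        if not match:
            continue
        path = match.group("path").split("?", 1)[0]  # drop any query string
        if IMAGE.search(path):
            hits[(path, match.group("status"))] += 1
    return hits


# Hypothetical log lines, for demonstration only.
sample_log = [
    '203.0.113.7 - - [06/Apr/2012:19:25:30 +0000] "GET /img/logo.jpg HTTP/1.1" 200 24884',
    '203.0.113.7 - - [06/Apr/2012:19:30:02 +0000] "GET /img/logo.jpg HTTP/1.1" 304 -',
    '203.0.113.7 - - [06/Apr/2012:19:30:02 +0000] "GET /index.html HTTP/1.1" 200 5120',
]
for (path, status), count in sorted(image_hits(sample_log).items()):
    print(path, status, count)
```

To run it against a real log, pass an open file instead of the sample list, for example `image_hits(open("/var/log/apache2/access.log"))`. If no new image entries appear after restarting the browser, the requests never reached the server.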
一纸荒年 Trace。
#7 · 2019-01-13 00:39

What you are seeing in Chrome is not a record of the actual HTTP requests; it's a record of asset requests. Chrome does this to show you that an asset is being requested by the page. However, this view does not indicate whether an HTTP request is actually made. If an asset is cached, Chrome will never create the underlying HTTP request.

You can also confirm this by hovering over the purple segments in the timeline. Cached resources will show "(from cache)" in the tooltip.

In order to see the actual HTTP requests, you need to look on a lower level. In some browsers this can be done with a plugin (like Live HTTP Headers).

In reality though, to verify the requests are not actually being made you need to check your server logs or use a debugging proxy like Charles or Fiddler. This will work on an HTTP level to make sure the requests are not actually happening.
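If you have neither server access nor a debugging proxy at hand, a throwaway local server that records every request reaching it can play the same role. The Python sketch below uses only the standard library; `LoggingHandler` and `run_demo` are hypothetical names, and the response headers mimic the ones in the question.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class LoggingHandler(BaseHTTPRequestHandler):
    """Serve a dummy image with a one-year max-age and record each request."""

    def do_GET(self):
        self.server.seen.append(self.path)  # a request really hit the wire
        body = b"fake image bytes"
        self.send_response(200)
        self.send_header("Content-Type", "image/jpeg")
        self.send_header("Cache-Control", "max-age=31536000, public")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the default stderr logging


def run_demo():
    """Start the server on a free port, fetch one image, return what was seen."""
    server = HTTPServer(("127.0.0.1", 0), LoggingHandler)
    server.seen = []
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1],
                                          timeout=5)
        conn.request("GET", "/logo.jpg")
        response = conn.getresponse()
        response.read()
        cache_control = response.getheader("Cache-Control")
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return server.seen, cache_control


if __name__ == "__main__":
    print(run_demo())
```

The demo client here has no cache, so every fetch is recorded. To test a browser, keep the server running instead of shutting it down, load a page that references the image, restart the browser, and see whether the path is recorded a second time.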
