How to verify that Squid is being used as a reverse proxy

Posted 2020-07-17 16:04

Question:

We want to decrease the load in one of our web servers and we are running some tests with squid configured as a reverse proxy.

The configuration is shown below:

http_port 80 accel defaultsite=original.server.com

cache_peer original.server.com parent 80 0 no-query originserver name=myAccel

acl our_sites dstdomain .contentpilot.net

http_access allow our_sites

cache_peer_access myAccel allow our_sites

cache_peer_access myAccel deny all

The problem is that Squid returns TCP_MISS for almost every request:

1238022316.988     86 69.15.30.186 TCP_MISS/200 797 GET http://original.server.com/templates/site/images/topnav_givingback.gif - FIRST_UP_PARENT/myAccel -
1238022317.016     76 69.15.30.186 TCP_MISS/200 706 GET http://original.server.com/templates/site/images/topnav_diversity.gif - FIRST_UP_PARENT/myAccel -
1238022317.158     75 69.15.30.186 TCP_MISS/200 570 GET http://original.server.com/templates/site/images/topnav_careers.gif - FIRST_UP_PARENT/myAccel -
1238022317.344     75 69.15.30.186 TCP_MISS/200 2981 GET http://original.server.com/templates/site/js/home-search-personalization.js - FIRST_UP_PARENT/myAccel -
1238022317.414     85 69.15.30.186 TCP_MISS/200 400 GET http://original.server.com/templates/site/images/submenu_arrow.gif - FIRST_UP_PARENT/myAccel -
1238022317.807     75 69.15.30.186 TCP_MISS/200 2680 GET http://original.server.com/templates/site/js/homeMakeURL.js - FIRST_UP_PARENT/myAccel -
1238022318.666   1401 69.15.30.186 TCP_MISS/200 103167 GET http://original.server.com/portalresource/lookup/wosid/intelliun-2201-301/image2.jpg - FIRST_UP_PARENT/myAccel image/pjpeg
1238022319.057   1938 69.15.30.186 TCP_MISS/200 108021 GET http://original.server.com/portalresource/lookup/wosid/intelliun-2201-301/image1.jpg - FIRST_UP_PARENT/myAccel image/pjpeg
1238022319.367     83 69.15.30.186 TCP_MISS/200 870 GET http://original.server.com/templates/site/images/home_dots.gif - FIRST_UP_PARENT/myAccel -
1238022319.367     80 69.15.30.186 TCP_MISS/200 5052 GET http://original.server.com/templates/site/images/home_search.jpg - FIRST_UP_PARENT/myAccel -
1238022319.368     88 69.15.30.186 TCP_MISS/200 5144 GET http://original.server.com/templates/site/images/home_continue.jpg - FIRST_UP_PARENT/myAccel -
1238022319.368     76 69.15.30.186 TCP_MISS/200 412 GET http://original.server.com/templates/site/js/showFooterBar.js - FIRST_UP_PARENT/myAccel -
1238022319.377    100 69.15.30.186 TCP_MISS/200 399 GET http://original.server.com/templates/site/images/home_arrow.gif - FIRST_UP_PARENT/myAccel -

We have already tried clearing out the cache entirely. Any ideas? Could it be that my web site marks some of the content as different each time, even though it has not changed since the proxy last requested it?
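To put a number on "almost all the time", the result codes in access.log can be tallied. The following is only a rough sketch: it assumes Squid's native log format as in the excerpt above, and the helper names `result_codes` and `hit_ratio` are made up for illustration. Counting every code that contains "HIT" as a hit is a simplification.

```python
from collections import Counter


def result_codes(log_lines):
    """Count Squid result codes (TCP_MISS, TCP_HIT, ...) in native-format log lines."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        # The fourth field looks like "TCP_MISS/200"; keep the part before the slash.
        counts[fields[3].split("/", 1)[0]] += 1
    return counts


def hit_ratio(counts):
    """Fraction of requests whose result code contains "HIT" (a simplification)."""
    total = sum(counts.values())
    hits = sum(n for code, n in counts.items() if "HIT" in code)
    return hits / total if total else 0.0


sample = [
    "1238022316.988     86 69.15.30.186 TCP_MISS/200 797 GET "
    "http://original.server.com/templates/site/images/topnav_givingback.gif "
    "- FIRST_UP_PARENT/myAccel -",
    "1238022317.016     76 69.15.30.186 TCP_MISS/200 706 GET "
    "http://original.server.com/templates/site/images/topnav_diversity.gif "
    "- FIRST_UP_PARENT/myAccel -",
]
counts = result_codes(sample)
print(dict(counts), hit_ratio(counts))
```

Running it over the full log before and after a configuration change gives a quick way to see whether the change helped.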

Answer 1:

What headers is the origin server (web server) sending back with your content? For Squid to cache a response usefully, I believe it generally needs a validator (Last-Modified or ETag) or explicit freshness information (Expires or Cache-Control: max-age) in the response headers. Web servers typically add these automatically for static content, but if your content is served dynamically (even from a static source), you have to make sure they are present and handle conditional request headers such as If-Modified-Since and If-None-Match.
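If the content goes through an application layer, the conditional-request handling mentioned above might look roughly like this. It is a minimal sketch, not how any particular framework does it: `is_not_modified` is a hypothetical helper, header names are assumed to be lower-cased already, and ETags are compared weakly (ignoring a `W/` prefix).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_not_modified(request_headers, etag, last_modified):
    """Return True if the origin could answer 304 Not Modified.

    request_headers: dict with lower-case header names (assumption)
    etag: current entity tag including its quotes, e.g. '"abc123"'
    last_modified: timezone-aware datetime of the resource
    """
    def strip_weak(tag):
        return tag[2:] if tag.startswith("W/") else tag

    if_none_match = request_headers.get("if-none-match")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        candidates = [strip_weak(t.strip()) for t in if_none_match.split(",")]
        return "*" in candidates or strip_weak(etag) in candidates

    if_modified_since = request_headers.get("if-modified-since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            # HTTP dates have one-second resolution.
            return last_modified.replace(microsecond=0) <= since
        except (TypeError, ValueError):
            return False  # unparseable date: send the full response
    return False


resource_mtime = datetime(2009, 3, 20, 12, 0, 0, tzinfo=timezone.utc)
print(is_not_modified({"if-none-match": '"abc"'}, '"abc"', resource_mtime))
```

When the function returns True the origin can reply with 304 and no body, which lets the proxy keep serving its stored copy.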

Also, since I was pointed to this question by your subsequent question about sessions: is there a "Vary" header in the response? For example, "Vary: Cookie" tells caches that the content can vary according to the Cookie header in the request, so it should be removed from responses for static content. Your web server might be adding it to every response whenever there is a session, regardless of whether the data being served is static or dynamic.
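A quick way to spot this is to look at the Vary field names in a captured response. The sketch below assumes the headers are already in a dict with lower-case keys; `vary_fields`, `varies_on_cookie` and `without_cookie_vary` are illustrative names, not part of Squid or any web server.

```python
def vary_fields(headers):
    """Return the lower-cased field names listed in the Vary header."""
    value = headers.get("vary", "")
    return [f.strip().lower() for f in value.split(",") if f.strip()]


def varies_on_cookie(headers):
    """True if the response varies on Cookie (or on everything, via "*")."""
    fields = vary_fields(headers)
    return "cookie" in fields or "*" in fields


def without_cookie_vary(headers):
    """Copy the headers, dropping Cookie from Vary (intended for static content only)."""
    kept = [f for f in vary_fields(headers) if f != "cookie"]
    cleaned = {k: v for k, v in headers.items() if k != "vary"}
    if kept:
        cleaned["vary"] = ", ".join(kept)
    return cleaned


response_headers = {"vary": "Accept-Encoding, Cookie", "etag": '"abc"'}
print(varies_on_cookie(response_headers))
print(without_cookie_vary(response_headers))
```

In practice the removal would happen in the web server or application configuration; the function only shows what the cleaned header set should look like.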

In my experience, some experimentation with the HTTP headers to see what the effects are on caching is of great benefit: I remember finding that the solutions were not always obvious.



Answer 2:

Examine the returned headers with Wireshark or with Firebug in Firefox (the latter is easier to poke around in, but the former gives you more low-level information if you end up needing it).

Look for these items in the Response Headers (click on an item in the "Net" view to expand it and see the request and response headers):

  • Last-Modified date -> if this is not set to a sensible time in the past, the item won't be cached
  • ETag -> if this changes every time the same item is requested, the item will be re-fetched
  • Cache-Control -> requests from the client with max-age=0 will (I believe) make the proxy revalidate or re-fetch the item each time
  • (edit) Expires header -> if this is set in the past (i.e. always expired), Squid will not cache the item
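The checklist above can be turned into a small script once the response headers have been copied out of Firebug or Wireshark. This is a sketch under assumptions: `audit_response_headers` is a hypothetical helper, header names are expected in lower case, and the Cache-Control check looks at response directives (no-store, no-cache, private, max-age=0), which goes slightly beyond the request-side case in the list.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def audit_response_headers(headers, now=None):
    """Return a list of warnings about headers that can prevent caching."""
    now = now or datetime.now(timezone.utc)
    warnings = []

    def parse_date(name):
        try:
            parsed = parsedate_to_datetime(headers[name])
        except (KeyError, TypeError, ValueError):
            return None
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)  # treat "-0000" as UTC
        return parsed

    if "last-modified" not in headers:
        warnings.append("no Last-Modified header")
    else:
        last_modified = parse_date("last-modified")
        if last_modified is None or last_modified > now:
            warnings.append("Last-Modified is unparseable or in the future")

    if "etag" not in headers:
        warnings.append("no ETag header")

    directives = [d.strip().lower() for d in headers.get("cache-control", "").split(",")]
    for directive in ("no-store", "no-cache", "private", "max-age=0"):
        if directive in directives:
            warnings.append("Cache-Control contains " + directive)

    if "expires" in headers:
        expires = parse_date("expires")
        if expires is None or expires <= now:
            warnings.append("Expires is invalid or already in the past")

    return warnings


def etag_changed(first_headers, second_headers):
    """True if two responses for the same URL carry different ETags."""
    return first_headers.get("etag") != second_headers.get("etag")
```

An empty list from `audit_response_headers` does not guarantee a cache hit, but any warning is a good place to start looking. Calling `etag_changed` on two captures of the same URL covers the ETag bullet.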

As suggested by araqnid, the HTTP headers can make a huge difference to what the proxy thinks it can cache. If your back end is Apache, test that static files served directly, without going through PHP or any other application layer, are cacheable.
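One way to run that test is to request the same static file twice through the proxy and look at the X-Cache response header, which Squid normally sets to something like "MISS from hostname" or "HIT from hostname". The sketch below assumes that header is present and carries a single value; `cache_status` and `fetch_twice` are illustrative names, and the URL passed in would be one of your own static files.

```python
import urllib.request


def cache_status(x_cache_value):
    """Reduce an X-Cache value such as "HIT from proxy.example" to "HIT"."""
    parts = (x_cache_value or "").split()
    return parts[0].upper() if parts else None


def fetch_twice(url):
    """Request the same URL twice and return the X-Cache status of each response."""
    statuses = []
    for _ in range(2):
        with urllib.request.urlopen(url) as response:
            statuses.append(cache_status(response.headers.get("X-Cache")))
    return statuses


# Example call (needs network access to the proxy):
#   fetch_twice("http://original.server.com/templates/site/images/home_dots.gif")
```

If the object is cacheable, the first status would typically be MISS and the second HIT. Two MISS results for a plain static file point back at the response headers.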

Also, check that the Squid settings maximum_object_size and minimum_object_size have sensible values (the defaults are 4 MB and 0 KB, which should be fine), and that the maximum cache item ages are also set sensibly. (See http://www.visolve.com/squid/squid30/cachesize.php#maximum_object_size for this and other settings.)