Client Web Browser Behavior When Handling 301 Redi

Published 2019-03-24 07:37

Question:

The RFC seems to suggest that the client should permanently cache the response: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

10.3.2 301 Moved Permanently

The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible. This response is cacheable unless indicated otherwise.

The new permanent URI SHOULD be given by the Location field in the response. Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s).

If the 301 status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.

  Note: When automatically redirecting a POST request after
  receiving a 301 status code, some existing HTTP/1.0 user agents
  will erroneously change it into a GET request.

I'm having a hard time finding concrete browser documentation for any major browser that states how they handle these.

I've started digging through the source code of firefox, but quickly got lost.

For which browsers (if any) is the following scenario true, and is there definitive documentation for either Firefox or IE that states as much?

First Time Around:

  • 1.1: User enters a link to Site A, or clicks on a link directed at Site A.
  • 1.2: Browser interprets the link to Site A for the first time, with nothing cached. Sends GET to Site A.
  • 1.3: Site A responds with a 301 redirect to Site B.
  • 1.4: Browser sends GET to Site B.

Any Subsequent Times Around:

  • 2.1: User clicks on a link directed at Site A.
  • 2.2: Browser sees that, due to a past 301 redirect, Site A should now be Site B.
  • 2.3: Without initiating any request whatsoever at Site A, browser initiates GET at Site B.
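
To make the scenario concrete, here is a minimal sketch of a client that behaves the way the "Any Subsequent Times Around" steps describe. It is a toy model, not how any real browser is implemented: the class `RedirectCachingClient`, the `fake_server` function, and the `site-a.example` / `site-b.example` URLs are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical response shape: (status code, Location header or None).
Response = Tuple[int, Optional[str]]


class RedirectCachingClient:
    """Toy client that remembers 301 targets and skips the old URL afterwards."""

    def __init__(self, fetch: Callable[[str], Response]) -> None:
        self._fetch = fetch
        self._permanent: Dict[str, str] = {}  # old URL -> new URL learned from a 301
        self.requests_sent: List[str] = []    # every URL actually requested

    def get(self, url: str, max_hops: int = 10) -> str:
        """Return the final URL reached, following redirects."""
        for _ in range(max_hops):
            if url in self._permanent:
                # A past 301 is remembered: rewrite the URL without any request.
                url = self._permanent[url]
                continue
            self.requests_sent.append(url)
            status, location = self._fetch(url)
            if status == 301 and location:
                self._permanent[url] = location  # remember the permanent move
                url = location
                continue
            if status == 302 and location:
                url = location  # temporary redirect: follow it, do not remember it
                continue
            return url
        raise RuntimeError("too many redirects")


def fake_server(url: str) -> Response:
    """Stand-in for the network: Site A permanently redirects to Site B."""
    if url == "http://site-a.example/":
        return (301, "http://site-b.example/")
    return (200, None)
```

With this model, the first `get("http://site-a.example/")` records requests to Site A and then Site B, while a second call records only a request to Site B. The question is whether real browsers behave like this.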

  • Answer 1:

    I performed some tests and found that some browsers do cache the 301 result:

    Caches 301 result and skips contacting old address in future?
    
      Internet Explorer 7   no
      Firefox 3.0           no
      Chrome 4.0            yes
      Opera 10.01           yes for google.com, no for www.rnhart.net
    

    How I tested

    I used the following two 301 results to test with:

    • google.com returns a 301 to www.google.com
    • www.rnhart.net returns a 301 to rnhart.net

    I started a proxy server on my own computer (Proxomitron Naoko 4.2 with all filters turned off). In each browser, I set the proxy settings to point to my own computer. I cleared the browser's cache, then I visited the old address multiple times and looked in the proxy server's log window to see what requests the browser made.

    The first time the old address is visited, the proxy log shows the old address request, the 301 response, and the new address request. If the old address is visited again, the log shows either the same set of requests (the 301 wasn't cached) or only the new address request (the 301 was cached).

    I tested entering the old address in the address box, accessing the old address from a bookmark, and accessing the old address from a link on a page. Each browser worked the same way no matter how the address was accessed.
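
The decision rule used when reading the proxy log can be written down as a small function. This is only a sketch of the reasoning above, not a tool that was used in the test: the function name `redirect_was_cached` and the idea of passing the repeat-visit requests as a plain list of URLs are assumptions made for illustration.

```python
from typing import List


def redirect_was_cached(repeat_visit_requests: List[str],
                        old_url: str,
                        new_url: str) -> bool:
    """Classify a repeat visit from the URLs seen in the proxy log.

    True  -> only the new address was requested (the 301 was cached).
    False -> the old address was requested again (the 301 was not cached).
    """
    if new_url not in repeat_visit_requests:
        # Neither expected pattern: the browser never reached the new address.
        raise ValueError("new address was never requested on the repeat visit")
    return old_url not in repeat_visit_requests
```

For example, a repeat-visit log containing only `http://www.google.com/` would be classified as cached, while one containing both `http://google.com/` and `http://www.google.com/` would not.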


    [I found this question while investigating a similar Super User question: Do browsers change URLs of saved bookmarks in response to 301 redirection?]



    Answer 2:

    You may use this workaround:
    Issue a 302 redirect for users and a 301 only for search engines. On the server side, just check the user agent. If it is a bot, do a 301 redirect. Otherwise, do a 302.

    It is not the "golden way", but it works great.
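
A minimal sketch of the user-agent check described in this workaround might look like the following. The function name `redirect_status_for` and the short `BOT_MARKERS` list are assumptions for illustration; a real deployment would need its own, more complete way of recognizing crawlers.

```python
from typing import Optional

# Illustrative, incomplete list of substrings found in crawler User-Agent headers.
BOT_MARKERS = ("googlebot", "bingbot", "slurp")


def redirect_status_for(user_agent: Optional[str]) -> int:
    """Pick 301 for recognized search engine bots and 302 for everyone else."""
    ua = (user_agent or "").lower()
    if any(marker in ua for marker in BOT_MARKERS):
        return 301  # bots are told the move is permanent
    return 302      # regular browsers get a temporary redirect


# Example: choose the status code for an incoming request's User-Agent header.
status = redirect_status_for("Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/115.0")
```

The server would then send the chosen status code together with the same `Location` header in both cases.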