HTTP Status code for language redirect

Posted 2019-02-01 22:18

Question:

I wonder which HTTP status code I should send for language redirects.

I have the following PHP code, which redirects via HTTP headers to the highest-priority supported language in the browser's Accept-Language header.

<?php
$langs = array();

if (isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
    // break up string into pieces (languages and q factors)
    preg_match_all('/([a-z]{1,8}(-[a-z]{1,8})?)\s*(;\s*q\s*=\s*(1|0\.[0-9]+))?/i', $_SERVER['HTTP_ACCEPT_LANGUAGE'], $lang_parse);

    if (count($lang_parse[1])) {
        // create a list like "en" => 0.8
        $langs = array_combine($lang_parse[1], $lang_parse[4]);

        // set default to 1 for any without q factor
        foreach ($langs as $lang => $val) {
            if ($val === '') $langs[$lang] = 1;
        }

        // sort list based on value 
        arsort($langs, SORT_NUMERIC);
    }
}

// look through sorted list and use first one that matches our languages
foreach ($langs as $lang => $val) {
    if (strpos($lang, 'ca')===0) {
    header("location: ca/");
    exit;
    } else if (strpos($lang, 'es')===0) {
    header("location: es/");
    exit;
    } 
  echo "$lang => $val<br>";
}
// show default site or prompt for language
header("location: en/");

?>

Related question: HTTP status for functional redirect

Maybe 300, 301, 302, 303? Why?

EDIT

Google recently published this: http://googlewebmastercentral.blogspot.com/2011/12/new-markup-for-multilingual-content.html

I found this:

HTTP STATUS 300 Multiple Choices

The requested resource corresponds to any one of a set of representations, each with its own specific location, and agent-driven negotiation information (section 12) is being provided so that the user (or user agent) can select a preferred representation and redirect its request to that location.

Unless it was a HEAD request, the response SHOULD include an entity containing a list of resource characteristics and location(s) from which the user or user agent can choose the one most appropriate. The entity format is specified by the media type given in the Content-Type header field. Depending upon the format and the capabilities of the user agent, selection of the most appropriate choice MAY be performed automatically. However, this specification does not define any standard for such automatic selection.

If the server has a preferred choice of representation, it SHOULD include the specific URI for that representation in the Location field; user agents MAY use the Location field value for automatic redirection. This response is cacheable unless indicated otherwise.
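
As a rough illustration of the passage quoted above, a 300 response for a language choice might be assembled as follows. This is only a sketch: the function name `buildMultipleChoices`, the language list and the HTML body are assumptions made for illustration and do not come from the specification.

```php
<?php
// Hypothetical helper: builds the pieces of a 300 Multiple Choices response.
// $choices maps a URI to a human-readable label; $preferred is optional.
function buildMultipleChoices(array $choices, ?string $preferred = null): array
{
    $headers = ['Content-Type: text/html; charset=utf-8'];
    if ($preferred !== null && isset($choices[$preferred])) {
        // The server's preferred choice goes into Location; user agents
        // MAY use it for automatic redirection.
        $headers[] = 'Location: ' . $preferred;
    }

    // The entity lists every location the user (or user agent) can choose from.
    $items = '';
    foreach ($choices as $uri => $label) {
        $items .= sprintf(
            '<li><a href="%s">%s</a></li>',
            htmlspecialchars($uri, ENT_QUOTES),
            htmlspecialchars($label, ENT_QUOTES)
        );
    }

    return ['status' => 300, 'headers' => $headers, 'body' => "<ul>$items</ul>"];
}

// Sending it (only meaningful inside a web request):
// $r = buildMultipleChoices(['ca/' => 'Catalan', 'es/' => 'Spanish', 'en/' => 'English'], 'en/');
// http_response_code($r['status']);
// foreach ($r['headers'] as $h) { header($h); }
// echo $r['body'];
```

Keeping the response as plain data makes it easy to inspect before anything is sent; whether a browser follows the Location value automatically is left to the user agent.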

And this:

HTTP Error 300 - Multiple choices

Introduction

Your Web server thinks that the URL provided by the client (e.g. your Web browser or our CheckUpDown robot) is not specific enough, and a further selection needs to be made from a number of choices.

This is typically the case where the URL represents a high level grouping of which lower level selections need to be made e.g. a directory within which the user must select a particular file to access.

300 errors in the HTTP cycle

Any client (e.g. your Web browser or our CheckUpDown robot) goes through the following cycle when it communicates with the Web server:

  • Obtain an IP address from the IP name of the site (the site URL without the leading 'http://'). This lookup (conversion of IP name to IP address) is provided by domain name servers (DNSs).
  • Open an IP socket connection to that IP address.
  • Write an HTTP data stream through that socket.
  • Receive an HTTP data stream back from the Web server in response. This data stream contains status codes whose values are determined by the HTTP protocol.
  • Parse this data stream for status codes and other useful information.

This error occurs in the final step above when the client receives an HTTP status code that it recognises as '300'.

Fixing 300 errors - general

The first thing you should do is check your URL in a Web browser. If you see some kind of Web page prompting you for further action/choices, then your URL as it stands is not detailed enough for the Web server to process.

Fixing 300 errors - CheckUpDown

You should never see this error on your CheckUpDown account if you gave us a top-level URL (such as www.isp.com) to check. If it does occur for a top-level URL, it is highly likely that the Web server software has been incorrectly programmed or configured. If you have given us a low-level URL (such as www.isp.com/products/index.html) to check, then it is likely that this URL is not accessible even via a Web browser.

The first thing you should do is check your URL in a Web browser. If you see a sensible Web page, then it may indicate a defect in our software. If however you see some kind of Web page prompting you for further action/choices, then your URL is not suitable for us to check, because our system can not possibly make this kind of choice.

Please contact us directly (email preferred) whenever you encounter 300 errors. Only we can resolve them for you. If there is a defect in our software we will fix it. If however your URL is fundamentally unsuitable for us to use, you need to change it on your CheckUpDown account (start by clicking the 'Manage' button).

Answer 1:

Google uses 302 Found to redirect to localised pages.

I think it is safe if Google uses it...

However, it's always good to check what the chosen response code is intended for and whether it affects caching:

http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html



Answer 2:

You could serve every language under the same URL and use content negotiation on the Accept-Language header, but I wouldn't recommend that.

I would rather suggest that on your website's root URL you issue a redirect (303 See Other) to a language sub-page (e.g. /en). When you do that, respond with a Vary header that specifies Accept-Language (and any other relevant headers, such as Cookie). That way, any intermediaries (proxies, caches) will be able to cache the response. I would specifically not issue a 301, since you still want links to point to the root URL. On the language-specific page, I would put a rel="canonical" pointing to the root URL.

See also these threads:

  • How does Google treat HTTP response 303?
  • Canonical URLs and Content Negotiation
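
The redirect described in this answer could be sketched roughly like this. The helper names `pickLanguage` and `languageRedirectHeaders`, the supported-language list and the `en` fallback are illustrative assumptions; only the 303 status and the `Vary: Accept-Language` header come from the answer itself.

```php
<?php
// Hypothetical helper: pick the best supported language from an
// Accept-Language header value, falling back to $default.
function pickLanguage(string $acceptLanguage, array $supported, string $default): string
{
    $best = $default;
    $bestQ = 0.0;
    foreach (explode(',', $acceptLanguage) as $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        $q = 1.0; // a language without a q parameter counts as q=1
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $primary = explode('-', $tag)[0]; // "es-ES" -> "es"
        if (in_array($primary, $supported, true) && $q > $bestQ) {
            $best = $primary;
            $bestQ = $q;
        }
    }
    return $best;
}

// Hypothetical helper: the header lines for the root-URL redirect.
function languageRedirectHeaders(string $lang): array
{
    return [
        'Vary: Accept-Language', // tells caches the response depends on this request header
        'Location: /' . $lang . '/',
    ];
}

// In the root URL's script (web request only):
// $lang = pickLanguage($_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '', ['ca', 'es', 'en'], 'en');
// http_response_code(303); // set first, so the Location header does not default to 302
// foreach (languageRedirectHeaders($lang) as $h) { header($h); }
// exit;
```

If the redirect also depends on a cookie, as the answer mentions, the Vary value would need to list that header as well.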


Answer 3:

Possibly HTTP 300 "Multiple Choices" as it's technically the same data/document but available in multiple languages?



Answer 4:

I think the question depends more on what you want to achieve:

1: Your index page should be the landing page for your visitors, and you want that page to be indexed by search engines.

Pros: You have one entry page for all your visitors that can host additional information before the actual landing page. However, it will not have content for a specific language.

Cons: You don't have any content pages for all languages on search engines.

2: The actual translated page should be the landing page, and your visitors should end up at the translated page directly whenever possible. The redirect page is only for visitors who arrived at your site by entering the hostname in the address bar.

Pros: You have multiple "landing pages" for each individual language, which helps scoring and clickthrough.

Cons: You don't have a generic landing page.

There are more pros and cons to these two choices, but I can't think of them right now.

If option 1: use a 302, because you still want the index page to be part of the search index. If option 2: use a 301, because you don't want that page to be indexed. Alternatively, use a noindex on the language selection page.
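
The choice between the two options can be written down as a small sketch. The function name `redirectStatusFor` and the `en/` target are hypothetical, and the status codes simply mirror this answer's recommendation rather than any documented search-engine rule.

```php
<?php
// Hypothetical helper mirroring the two options above.
// $indexRoot = true  -> option 1: keep the root page in the search index (302).
// $indexRoot = false -> option 2: let the translated pages be indexed instead (301).
function redirectStatusFor(bool $indexRoot): int
{
    return $indexRoot ? 302 : 301;
}

// Usage inside a web request, with 'en/' as an assumed target:
// header('Location: en/', true, redirectStatusFor(true));
// exit;

// Assumed alternative for the noindex variant: send the hint as a header
// on the language selection page instead of redirecting permanently.
// header('X-Robots-Tag: noindex');
```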

As far as I know, Google only takes 301, 302 and 307 (temporary maintenance) into account, and I think it treats everything else as a 302 (which seems most logical). As far as the browser goes, I don't think it matters. It might affect caching, but I think browsers nowadays are pretty aggressive about caching even 3xx responses.



Answer 5:

HTTP 303, because it has the most suitable wording: See Other (compare 302 Moved Temporarily and 301 Moved Permanently). Strictly speaking, a 303 response is typically used so that the user's browser can safely refresh the server response without causing the initial HTTP POST request to be resubmitted.