I want to declare more than one language for a document, because it's available in more than one language. If I use:
<meta http-equiv="content-language" content="en,de,fr" />
this does not pass the W3C validator, which says I should define the language in the root element's lang attribute, but that attribute only supports one language:
<html lang="en">
works, but not
<html lang="de,en,fr">
So where should I define it?
Each lang attribute accepts only one language, so on the html element you should declare a single one: the most important language of the document. That alone does not cover a document that contains several languages, so here is the information that solves your problem:
The lang and xml:lang attributes do not allow you to assign multiple languages to a single document. So if you're writing a Web page with multiple languages you have two options:
- Define a primary language with the lang attribute, and then call out the secondary language(s) with lang attributes on elements in the document
- Define lang in the specific sections of the document as needed:
<div lang="fr-CA" xml:lang="fr-CA">
Canadian French content...
</div>
<div lang="en-CA" xml:lang="en-CA">
Canadian English content...
</div>
<div lang="nl-NL" xml:lang="nl-NL">
Netherlands, Dutch content...
</div>
I have some multiple-language pages and I do use the 2nd option.
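For comparison, the first option might look like the following minimal sketch. It assumes English as the primary language; the title and text are placeholders, not taken from any real page.

```html
<!DOCTYPE html>
<!-- Primary language declared once on the root element (assumed: English) -->
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Mixed-language page (placeholder title)</title>
</head>
<body>
  <!-- No lang here: this paragraph inherits "en" from the root element -->
  <p>English content...</p>

  <!-- Secondary languages are marked on the elements that contain them -->
  <div lang="de">
    <p>Deutscher Inhalt...</p>
  </div>
  <div lang="fr">
    <p>Contenu en français...</p>
  </div>
</body>
</html>
```

Elements without their own lang attribute inherit the language of the nearest ancestor that has one, so only the exceptions need to be marked.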
You might want to read http://www.w3.org/TR/2007/NOTE-i18n-html-tech-lang-20070412/#ri20060630.133619987
The meaning of the Content-Language HTTP header, and hence of its meta tag surrogate, is that it declares the languages of the document, or the languages of the intended audience (the relevant RFCs contradict each other), not the languages of some other documents (like translations of the current document). The practical effect of the header is small, probably limited to using the first language named as the language of the document if there is no language information in the HTML markup.
To indicate that a document is available in other languages, you can use tags like
<link rel="alternate" hreflang="de" href="foobar.de.html">
See section 12.3.3, "Links and search engines", in the HTML 4.01 spec.
There is no guarantee that this will have any effect. It might affect search engines, but not more than a normal link would do. Some old browser versions had commands for selecting alternate versions of a document, based on elements like this, but the feature seems to have been dropped.
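As a rough sketch, a page that is also available in German and French could list each translation in its head. The file name foobar.fr.html and the title are assumptions made for illustration, following the naming of the example above.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Foobar (placeholder title)</title>
  <!-- One link element per translation; hreflang names the language of the
       linked document, not of this one -->
  <link rel="alternate" hreflang="de" href="foobar.de.html">
  <link rel="alternate" hreflang="fr" href="foobar.fr.html">
</head>
<body>
  <p>English content...</p>
</body>
</html>
```

The language of the current page is still declared by the lang attribute on the html element; the link elements only point to the other versions.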
What HTML version do you use? In HTML 4.01, your use of Content-Language with multiple languages is valid. In HTML5, it's not.
But even for HTML 4.01, the use of Content-Language for the meta element is not recommended: HTTP headers, meta elements and language information (W3C)
You can't use it like this.
You'll have to either use an encoding that encompasses all desired characters (e.g. UTF-8, which supports the entire Unicode range), or else use named entities or numeric character references to include characters outside the encoding in use.
http://bytes.com/topic/html-css/answers/154652-multiple-languages-one-document
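Both approaches can be sketched in one small page. This is only an illustration, assuming an HTML5 document served as UTF-8; the sample words are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Declares UTF-8, so any Unicode character can be typed directly -->
  <meta charset="utf-8">
  <title>Encoding sketch (placeholder title)</title>
</head>
<body>
  <!-- Typed directly, relying on the UTF-8 encoding -->
  <p lang="fr">café</p>
  <p lang="zh-CN">中文</p>

  <!-- The same text written with character references, which work
       regardless of the document's encoding -->
  <p lang="fr">caf&eacute; or caf&#233;</p>
  <p lang="zh-CN">&#x4E2D;&#x6587;</p>
</body>
</html>
```

Note that the encoding only determines which characters can be written directly; it does not declare the language, which is still the job of the lang attribute.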
UPDATE
If you are using HTML5, you can use lang on each element. That means if you have a div that contains Mandarin Chinese, just define the attribute lang="zh-CN" for that div, like <div lang="zh-CN">. (See "What is the HTML5 alternative to the obsolete meta http-equiv=content-language".)
As the other posters and the W3C have pointed out, you cannot specify more than one language in the lang attribute of the html tag.
However, as shown in this answer to "What attribute value should I use for a mixed language page?", you can mark up different parts of a page with elements such as div and span to indicate the different languages (or references to other languages) used on the page.
Also, you can create metadata that describes multiple languages for the intended audience of a page, rather than the language of a specific range of text. You do so by getting the server to send the information in the HTTP Content-Language header. If your intended audience speaks more than one language, the HTTP header allows you to use a comma-separated list of languages.
Here is an example of an HTTP header that declares the resource to be a mixture of English, Hindi and Punjabi from the W3C's article Declaring language in HTML:
Content-Language: en, hi, pa
Please note: since you should always use a language attribute on the html tag, and the language attribute always overrides the HTTP header information, this really becomes a fine point. The HTTP header should be used only to provide metadata about the intended audience of the document as a whole, and the language attribute on the html tag should be used to declare the default language of the content.
For details on this last technique, see HTTP headers, meta elements and language information. For general language declarations and markup, see Declaring language in HTML.