According to the W3C XML Recommendation, start tag-names have the definition:
STag ::= '<' Name (S Attribute)* S? '>'
..where Name is:
Name ::= NameStartChar (NameChar)*
NameStartChar ::= ":" | [A-Z] | ...
..(n.b., this states that a colon can appear as the first character), suggesting the following is a valid XML document:
<?xml version="1.0" ?><:doc></:doc>
..but any parser I try this in shows the colon as a formatting error.
Also, Appendix B (though now a deprecated part of the document) explicitly states:
Characters ':' and '_' are allowed as name-start characters.
..and:
<?xml version="1.0" ?><_doc></_doc>
..is accepted by the XML parsers I've tried.
So, is a colon a valid first character in a tag-name, and the parsers I'm using are wrong, or am I reading the specification wrong?
Yes, at the base XML level, a colon (:) is allowed as a name-start character. The BNF rules you cite clearly specify this.
However, the W3C XML Recommendation is clear that colons should not be used except for namespace purposes.
And the XML Namespaces BNF rules for tags are based on QName, which allows a colon in a name only as a separator between Prefix and LocalPart.
One might ask why colon wasn't disallowed in NameStartChar from the beginning. If we're lucky, C. M. Sperberg-McQueen may offer an authoritative explanation. However, I suspect it's a matter of an evolving notion of how namespaces were expected to be designed.
The first published working draft of the W3C XML Recommendation, in 1996, had a definition of STag which did not allow colon.
By 1998, colons were allowed in Name, and an earlier form of the admonition about colon use was already present.
The need for namespaces was anticipated, but the precise form perhaps was not yet known when colon was first introduced to tag names.
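To make the difference between the base Name rule and the namespace QName rule concrete, here is a rough sketch in Python. It is an ASCII-only simplification (the real productions also admit many non-ASCII ranges), and the helper names is_name and is_qname are made up for illustration; they are not part of any library.

```python
import re

# ASCII-only approximation of the XML 1.0 productions (assumption: the
# non-ASCII ranges of NameStartChar/NameChar are deliberately left out).
NAME_START = r"[:A-Za-z_]"
NAME_CHAR = r"[:A-Za-z_\-.0-9]"

# NCName is a Name with no colon anywhere ("non-colonized name").
NCNAME = r"[A-Za-z_][A-Za-z_\-.0-9]*"

# Name ::= NameStartChar (NameChar)*
NAME_RE = re.compile(rf"{NAME_START}{NAME_CHAR}*")

# QName ::= (Prefix ':')? LocalPart, where Prefix and LocalPart are NCNames.
QNAME_RE = re.compile(rf"(?:{NCNAME}:)?{NCNAME}")


def is_name(s: str) -> bool:
    """True if s matches the (ASCII-simplified) base XML Name rule."""
    return NAME_RE.fullmatch(s) is not None


def is_qname(s: str) -> bool:
    """True if s matches the (ASCII-simplified) Namespaces QName rule."""
    return QNAME_RE.fullmatch(s) is not None


for candidate in (":doc", "_doc", "a:doc", "a:b:c"):
    print(candidate, is_name(candidate), is_qname(candidate))
```

Under this sketch, :doc satisfies Name but not QName, because a QName only permits a single colon, and only between a non-empty prefix and a local part.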
They are allowed in non-namespace-aware XML but they are not allowed in namespace-aware XML. More specifically, the base XML recommendation allows them but the Namespaces recommendation prohibits them. Very few people nowadays use non-namespace-aware XML (and I'm not sure what parsers support it) so it's best to assume they aren't allowed.
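One way to see the two modes side by side is Python's xml.parsers.expat, which leaves namespace processing off unless a namespace_separator is passed to ParserCreate. The following is only a sketch: the helper name parses is hypothetical, and the exact behaviour may depend on the Expat build bundled with your Python.

```python
from xml.parsers import expat


def parses(doc: str, namespace_aware: bool) -> bool:
    """Return True if Expat accepts doc as well-formed in the given mode."""
    if namespace_aware:
        # Passing a separator switches Expat's namespace processing on.
        parser = expat.ParserCreate(namespace_separator=" ")
    else:
        # Default: plain XML 1.0 parsing, no namespace processing.
        parser = expat.ParserCreate()
    try:
        parser.Parse(doc, True)  # True marks this as the final chunk
    except expat.ExpatError:
        return False
    return True


colon_doc = '<?xml version="1.0" ?><:doc></:doc>'
underscore_doc = '<?xml version="1.0" ?><_doc></_doc>'

for label, doc in (("colon", colon_doc), ("underscore", underscore_doc)):
    for ns in (False, True):
        print(label, "namespace-aware" if ns else "plain", parses(doc, ns))
```

Many higher-level APIs (for example xml.etree.ElementTree) create the underlying parser with namespace processing enabled, which would explain why the leading colon is reported as an error there.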