The SVG spec refers the reader to the XML Base spec for the characters that may appear in the value of an id attribute.
The XML Base spec, however, does not spell out these characters, AFAICT. Instead, it makes its recommendations in terms of the "Unicode properties" ID_Start and ID_Continue.
I am looking for a table (or tables) explicitly listing the characters that have the ID_Start and/or ID_Continue properties.
(In case different applications or XML-based standards specify their own sets of characters with the ID_Start and/or ID_Continue properties, I am interested in HTML5-embedded SVG.)
I found a repo on GitHub that generates a bunch of different tables using Python scripts based on the Unicode standard. For example, here are the tables for ID_START, ID_CONTINUE, XID_START, XID_CONTINUE, etc.: https://github.com/sourtin/libucd/blob/master/src/tables/bool.rs
Edit: I think they are parsed from the XML databases provided at: http://www.unicode.org/Public/5.2.0/ucdxml/
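If you would rather build the table yourself than rely on a third-party repo, here is a minimal sketch. It assumes the plain-text file DerivedCoreProperties.txt from the Unicode Character Database, where each data line has the form `first..last ; Property # comment` (or a single code point). The names `parse_ranges`, `has_property` and `SAMPLE` are made up for this example, and `SAMPLE` is only a small excerpt in the style of that file, not the full data.

```python
import re

# Data lines look like "0041..005A    ; ID_Start # L&  [26] ..."
# or, for a single code point, "00AA          ; ID_Start # Lo ...".
_LINE = re.compile(r"^([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*(\w+)")


def parse_ranges(lines, prop):
    """Return sorted (first, last) code point ranges tagged with `prop`."""
    ranges = []
    for line in lines:
        m = _LINE.match(line)
        if m and m.group(3) == prop:
            first = int(m.group(1), 16)
            # A line without ".." describes a single code point.
            last = int(m.group(2) or m.group(1), 16)
            ranges.append((first, last))
    return sorted(ranges)


def has_property(ranges, cp):
    """True if code point `cp` falls inside any of the ranges."""
    return any(first <= cp <= last for first, last in ranges)


# Illustrative excerpt only; the real file is far longer.
SAMPLE = """\
# excerpt in the style of DerivedCoreProperties.txt
0041..005A    ; ID_Start # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
0061..007A    ; ID_Start # L&  [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
00AA          ; ID_Start # Lo       FEMININE ORDINAL INDICATOR
0030..0039    ; ID_Continue # Nd  [10] DIGIT ZERO..DIGIT NINE
""".splitlines()

id_start = parse_ranges(SAMPLE, "ID_Start")
for first, last in id_start:
    print(f"U+{first:04X}..U+{last:04X}")
```

To run it against real data, pass an open file object for a downloaded copy of DerivedCoreProperties.txt instead of `SAMPLE`; comment and blank lines are skipped because they do not match the pattern.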
It seems that the allowed character range is defined as follows:
An attribute value is:
AttValue ::= '"' ([^<&"] | Reference)* '"'
| "'" ([^<&'] | Reference)* "'"
http://www.w3.org/TR/2008/REC-xml-20081126/#NT-AttValue
A Reference is:
Reference ::= EntityRef | CharRef
http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Reference
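As a rough illustration, the AttValue and Reference productions quoted above can be approximated with a regular expression. This is only a sketch: the `NAME` pattern is an ASCII-only simplification of the real XML Name production, it does not check that a character reference resolves to a legal Char, and the helper name `is_att_value` is made up for this example.

```python
import re

# Simplified, ASCII-only stand-in for the XML Name production (an assumption).
NAME = r"[A-Za-z_:][A-Za-z0-9_:.\-]*"

# Reference ::= EntityRef | CharRef
REFERENCE = rf"&(?:{NAME}|#[0-9]+|#x[0-9a-fA-F]+);"

# AttValue ::= '"' ([^<&"] | Reference)* '"' | "'" ([^<&'] | Reference)* "'"
ATT_VALUE = re.compile(
    rf'"(?:[^<&"]|{REFERENCE})*"' + "|" + rf"'(?:[^<&']|{REFERENCE})*'"
)


def is_att_value(text):
    """True if `text` (including its quotes) fits the approximated AttValue."""
    return ATT_VALUE.fullmatch(text) is not None


print(is_att_value('"a&amp;b"'))  # entity reference inside double quotes
print(is_att_value('"a<b"'))      # a raw "<" is not allowed
```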
CharRef brings us to Char here:
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Char
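The Char production translates almost directly into a range check. A minimal sketch (the function name `is_xml_char` is made up for this example):

```python
def is_xml_char(cp):
    """True if code point `cp` matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # up to just before the surrogates
        or 0xE000 <= cp <= 0xFFFD      # skips U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


def all_xml_chars(text):
    """True if every character of `text` is a legal XML Char."""
    return all(is_xml_char(ord(ch)) for ch in text)


print(all_xml_chars("my-id_1"))
print(is_xml_char(0x0))
```

Note that this only tells you what may appear in an attribute value in general; it is much more permissive than the ID_Start / ID_Continue sets asked about in the question.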