ISO table(s) of valid characters for SVG ids

Posted 2019-06-26 08:54

Question:

The SVG spec refers the reader to the XML Base spec for the characters that may appear in the value of an id attribute.

The XML Base spec, however, does not spell out these characters, AFAICT. Instead, it makes its recommendations in terms of "Unicode properties" ID_Start and ID_Continue.

I am looking for a table (or tables) listing explicitly those characters that have the ID_Start and/or ID_Continue properties.

(In case different applications or XML-based standards specify their own sets of characters with the ID_Start and/or ID_Continue properties, I am interested in HTML5-embedded SVG.)

Answer 1:

I found a repo on GitHub that generates a bunch of different tables from the Unicode standard using Python scripts. For example, here are tables for ID_START, ID_CONTINUE, XID_START, XID_CONTINUE, etc.: https://github.com/sourtin/libucd/blob/master/src/tables/bool.rs

Edit: I think they are parsed from XML databases provided in: http://www.unicode.org/Public/5.2.0/ucdxml/
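
If you would rather build the tables yourself, the Unicode Character Database also ships a plain-text file, DerivedCoreProperties.txt, that lists the ID_Start and ID_Continue code point ranges one per line. Below is a minimal Python sketch of reading that line format. The names parse_ranges, has_property and SAMPLE are made up for illustration, and SAMPLE is only a tiny hand-copied excerpt, not the real table; in practice you would read the full file for the Unicode version you care about.

```python
import re

# Matches data lines such as "0041..005A    ; ID_Start # L&  [26] ..."
# as well as single code points such as "00AA          ; ID_Start # Lo ...".
# Comment lines (starting with '#') and blank lines do not match.
_LINE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")

def parse_ranges(text, prop):
    """Return sorted (first, last) code point ranges that carry `prop`."""
    ranges = []
    for line in text.splitlines():
        m = _LINE.match(line)
        if m and m.group(3) == prop:
            first = int(m.group(1), 16)
            # A single code point is a range of length one.
            last = int(m.group(2) or m.group(1), 16)
            ranges.append((first, last))
    return sorted(ranges)

def has_property(ranges, ch):
    """True if the character's code point falls inside any of the ranges."""
    cp = ord(ch)
    return any(first <= cp <= last for first, last in ranges)

# Illustrative excerpt only (assumption: copied by hand in the file's format).
SAMPLE = """\
# comment lines like this one are skipped
0041..005A    ; ID_Start # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
0061..007A    ; ID_Start # L&  [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
00AA          ; ID_Start # Lo       FEMININE ORDINAL INDICATOR
0030..0039    ; ID_Continue # Nd  [10] DIGIT ZERO..DIGIT NINE
"""

id_start = parse_ranges(SAMPLE, "ID_Start")
print(has_property(id_start, "a"), has_property(id_start, "0"))
```

A linear scan over the ranges is fine for a quick check; for repeated lookups against the full table a binary search over the sorted ranges would be the more usual choice.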



Answer 2:

It seems the allowed character range is defined by the XML spec.

An attribute value is:

AttValue   ::= '"' ([^<&"] | Reference)* '"'
               |  "'" ([^<&'] | Reference)* "'"

http://www.w3.org/TR/2008/REC-xml-20081126/#NT-AttValue

A Reference is:

Reference    ::=    EntityRef | CharRef 

http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Reference
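
As a rough illustration of the two productions above, here is a Python regex sketch of AttValue. It is a simplification, not a conforming parser: NAME is restricted to an ASCII subset (the real Name production allows far more characters), and is_att_value is a made-up helper name. It checks only the quoting and reference syntax, not whether each character is a legal XML character.

```python
import re

# Assumption: ASCII-only approximation of the XML Name production.
NAME = r"[A-Za-z_:][A-Za-z0-9_:.\-]*"

# Reference ::= EntityRef | CharRef
# EntityRef is '&' Name ';'; CharRef is '&#' digits ';' or '&#x' hex digits ';'.
REFERENCE = rf"&(?:{NAME}|#[0-9]+|#x[0-9a-fA-F]+);"

# AttValue ::= '"' ([^<&"] | Reference)* '"' | "'" ([^<&'] | Reference)* "'"
ATT_VALUE = re.compile(
    rf'"(?:[^<&"]|{REFERENCE})*"'
    rf"|'(?:[^<&']|{REFERENCE})*'"
)

def is_att_value(literal):
    """True if `literal` (including its quotes) matches the AttValue grammar."""
    return ATT_VALUE.fullmatch(literal) is not None

for candidate in ['"a b#c"', '"a&amp;b"', '"a<b"', '"a&b"']:
    print(candidate, is_att_value(candidate))
```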

CharRef brings us to Char here:

Char   ::=    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]    /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Char
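
The Char production translates almost directly into a range check. A minimal Python sketch (is_xml_char is just an illustrative name, not part of any library):

```python
def is_xml_char(ch):
    """True if the single character `ch` matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # everything below the surrogate block
        or 0xE000 <= cp <= 0xFFFD      # above the surrogates, excluding FFFE and FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )

for ch in ["A", "\t", "\x00", "\ufffe"]:
    print(repr(ch), is_xml_char(ch))
```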