XML comments and “--”

Posted 2019-03-22 09:28

Question:

<!-- here is some comment --
                            ^
                            |
                    what can be here apart from '>'?

XML does not seem to like '--' inside comments. I read somewhere that '--' switches some mode inside the <! ... > construct, but <!-- -- -- --> (an even number of '--'s) seems to be invalid too. If this is a historical feature, what is the "pro" side of it? (The "contra" side is the inability to have '--' in comments.)

What is the reason for complicating comment processing instead of simply making '-->' the end of a comment and allowing '--' inside?

Answer 1:

From the standards document:

http://www.w3.org/TR/REC-xml/#sec-comments

[Definition: Comments may appear anywhere in a document outside other markup; in addition, they may appear within the document type declaration at places allowed by the grammar. They are not part of the document's character data; an XML processor may, but need not, make it possible for an application to retrieve the text of comments. For compatibility, the string "--" (double-hyphen) must not occur within comments.] Parameter entity references must not be recognized within comments.
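
To see the rule in action, here is a minimal sketch using Python's standard xml.etree.ElementTree (which is backed by expat). The helper name is_well_formed is made up for this illustration, and other parsers may word their errors differently, but a conforming XML parser should reject the last two inputs, including the "even number of --s" case from the question.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Hypothetical helper: True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A lone hyphen inside a comment is allowed.
print(is_well_formed("<root><!-- a single - hyphen is fine --></root>"))

# A double hyphen inside a comment is rejected.
print(is_well_formed("<root><!-- a double -- hyphen is not --></root>"))

# An even number of '--' pairs does not help in XML.
print(is_well_formed("<root><!-- -- -- --></root>"))
```

Only the first call should print True.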



Answer 2:

Maybe this will be helpful for someone. I had a problem: I wanted to comment out a command-line parameter in XML that starts with --:

<arg line="-v --line-break 0" />  

so naturally the normal way, like this,

<!-- <arg line="-v --line-break 0" /> -->

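didn't work. But I found out that if each - is replaced by its numeric character reference &#x002D;, it works and is tolerated inside comments. (The reference is not expanded while it sits inside the comment; it only turns back into - once the line is uncommented and parsed as an attribute value.)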

So in my case the string

<arg line="-v &#x002d;&#x002d;line-break 0" />

is parsed correctly and can be part of comments.
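
A quick way to check both halves of this is sketched below with Python's standard xml.etree.ElementTree. The <root> wrapper element is only an assumption added so that the comment has a document to live in; it is not part of the original configuration.

```python
import xml.etree.ElementTree as ET

# The escaped element from above, with each '-' of the '--' pair written
# as a numeric character reference.
escaped = '<arg line="-v &#x002d;&#x002d;line-break 0" />'

# Commented out: the comment body contains no literal '--', so it parses.
# ElementTree drops comments by default, so <root> ends up with no children.
root = ET.fromstring(f"<root><!-- {escaped} --></root>")
print(len(root))

# Uncommented: the parser expands the references inside the attribute value,
# so the application sees the original command line again.
arg = ET.fromstring(escaped)
print(arg.get("line"))  # -v --line-break 0
```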

Of course it looks a little ugly, but if someone wants to keep a string containing -- as a comment in their XML, I think it's still better than nothing.



Answer 3:

It's one of those stupid rules that's in XML because it was in SGML and people didn't want to break compatibility. Why it's in SGML is anyone's guess: probably because it saved three lines of code in the original parser.



Answer 4:

-- is not allowed for compatibility with SGML. From "On SGML and HTML" in the HTML 4 specification:

White space is not permitted between the markup declaration open delimiter("<!") and the comment open delimiter ("--"), but is permitted between the comment close delimiter ("--") and the markup declaration close delimiter (">"). A common error is to include a string of hyphens ("---") within a comment. Authors should avoid putting two or more adjacent hyphens inside comments.

So in SGML <! and > open and close "markup declarations" and -- opens and closes comments.
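
As a rough illustration of that toggling behaviour, here is a deliberately simplified model in Python. It is based only on the rule quoted above, not on a real SGML parser, and the function name is_sgml_comment_declaration is invented for this sketch. It treats a declaration as "<!", then zero or more "-- comment --" units each optionally followed by white space, then ">".

```python
def is_sgml_comment_declaration(text: str) -> bool:
    """Simplified model (an assumption, not a real SGML parser):
    '<!' ('--' comment '--' whitespace*)+ '>'."""
    if not text.startswith("<!--"):
        return False
    pos = 2  # just past the markup declaration open delimiter "<!"
    while text.startswith("--", pos):
        # The next "--" after the opener closes this comment.
        close = text.find("--", pos + 2)
        if close == -1:
            return False  # comment opened but never closed
        pos = close + 2
        # White space is permitted after a comment close delimiter.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    # Only the markup declaration close delimiter may remain.
    return text[pos:] == ">"


print(is_sgml_comment_declaration("<!-- here is some comment -->"))
print(is_sgml_comment_declaration("<!-- -- -- -->"))     # two comments in one declaration
print(is_sgml_comment_declaration("<!-- a -- b -->"))    # "b" falls outside any comment
```

Under this model the first two inputs are accepted and the third is rejected, which is the "even number of --s" behaviour the question remembers from SGML. XML dropped the toggling but kept the ban on '--', so all but the first are invalid there.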



Tags: xml comments