Escaping HTML inside comment tags

Posted 2019-03-29 07:42

Question:

Escaping HTML is fine: it replaces <'s and >'s etc. with entities.

I've run into a problem where I am outputting a filename inside a comment tag, e.g. <!-- ${filename} -->

Of course things can go bad if you don't escape, so it becomes: <!-- <c:out value="${filename}"/> -->

The problem is that if the file has "--" in its name, all the HTML gets screwed up, since you're not allowed to have <!-- -- -->.

The standard HTML escape doesn't escape these dashes, and I was wondering if anyone is familiar with a simple / standard way to escape them.

Answer 1:

Definition of an HTML comment:

A comment declaration starts with <!, followed by zero or more comments, followed by >. A comment starts and ends with "--", and does not contain any occurrence of "--".

Of course the parsing of a comment is up to the browser.

Nothing strikes me as an obvious solution here, so I'd suggest you str_replace those double dashes out.
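
A minimal sketch of that replacement, written in JavaScript purely to show the logic (the asker's page is JSP, so this is an illustration rather than drop-in code). The helper name neutralizeDashRuns is hypothetical, and the choice of underscore as the substitute is an assumption. The result is lossy: the original name cannot be recovered from the comment.

```javascript
// Hypothetical helper: swap every run of two or more hyphens for the
// same number of underscores, so the text can never contain "--".
function neutralizeDashRuns(name) {
  return name.replace(/-{2,}/g, (run) => '_'.repeat(run.length));
}

const comment = '<!-- ' + neutralizeDashRuns('report--final.txt') + ' -->';
console.log(comment); // <!-- report__final.txt -->
```

Single hyphens are left alone, so a name like my-file.txt passes through unchanged.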



Answer 2:

There is no good way to solve this. You can't just escape them because comments are read in plaintext. You will have to do something like put a space between the hyphens, or use some sort of code for hyphens (like [HYPHEN]).
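
If the original name has to be recoverable, the "code for hyphens" idea can be made reversible. The sketch below is one possible scheme, not a standard: the [HYPHEN] token comes from the suggestion above, while the [LB] token for a literal "[" and the function names encodeHyphens / decodeHyphens are assumptions added here so that a filename which already contains "[HYPHEN]" still round-trips.

```javascript
// Hypothetical encoder: every "-" becomes "[HYPHEN]" and every "["
// becomes "[LB]", so each "[" in the output starts exactly one token.
function encodeHyphens(text) {
  return text.replace(/[-\[]/g, (ch) => (ch === '-' ? '[HYPHEN]' : '[LB]'));
}

// Inverse of encodeHyphens: expand the two tokens in a single pass.
function decodeHyphens(text) {
  return text.replace(/\[(HYPHEN|LB)\]/g, (_, token) =>
    token === 'HYPHEN' ? '-' : '['
  );
}

const encoded = encodeHyphens('a--b.txt');
console.log(encoded);                // a[HYPHEN][HYPHEN]b.txt
console.log(decodeHyphens(encoded)); // a--b.txt
```

Because the encoded text contains no hyphens at all, it cannot close or break the surrounding comment.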



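Answer 3:

Since it is obvious that you cannot directly display the '--'s, you can either encode them or use the JSTL fn:replace function for an appropriate replacement (fn:escapeXml on its own only escapes the XML special characters and leaves hyphens untouched). See the JSTL documentation.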



Answer 4:

There's no universally working way to escape those characters in HTML unless the - characters come in multiples of four: -- won't work in Firefox, but ---- will. So it all depends on the browser. For example, in Internet Explorer 8 it is not a problem; those characters are handled properly. The same goes for Google Chrome. Firefox, however, even in the latest version (3.0.4), doesn't handle these characters well.



Answer 5:

You shouldn't be trying to HTML-escape: the contents of comments are not escapable, and it's fine to have a bare ‘>’ or ‘&’ inside.

‘--’ is its own, unrelated problem and is not really fixable. If you don't need to recover the exact string, just do a replacement to get rid of them (e.g. replace with ‘__’).

If you do need to get a string through completely unmolested to a JavaScript that will be reading the contents of the comment, use a string literal:

<!-- 'my-string' -->

which the script can then read using eval(commentnode.data). (Yes, a valid use for eval() at last!)

Then your escaping problem becomes how to put things in JS string literals, which is fairly easily solvable by escaping the ‘'’ and ‘-’ characters:

<!-- 'Bob\x27s\x2D\x2Dstring' -->

(You should probably also escape ‘<’, ‘&’ and ‘"’, in case you ever want to use the same escaping scheme to put a JS string literal inside a <script> block or inline handler.)
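
A rough sketch of the string-literal escaping described above. The function names toCommentSafeLiteral and fromCommentLiteral are hypothetical. Escaping the backslash itself and control characters is an addition made here so that the literal stays valid and round-trips. The decoder is an assumed alternative for readers who would rather not call eval(); it only understands the \xHH and \uHHHH forms this encoder emits.

```javascript
// Hypothetical encoder: wrap the value in single quotes and hex-escape
// backslash, quotes, hyphen, '<', '&', control chars and U+2028/U+2029.
function toCommentSafeLiteral(value) {
  const escaped = value.replace(
    /[\\'"\-<&\u0000-\u001f\u2028\u2029]/g,
    (ch) => {
      const hex = ch.charCodeAt(0).toString(16).toUpperCase();
      return hex.length <= 2
        ? '\\x' + hex.padStart(2, '0')
        : '\\u' + hex.padStart(4, '0');
    }
  );
  return "'" + escaped + "'";
}

// Hypothetical decoder for the comment node's data, avoiding eval().
function fromCommentLiteral(data) {
  const body = data.trim().slice(1, -1); // drop the surrounding quotes
  return body.replace(
    /\\x([0-9A-Fa-f]{2})|\\u([0-9A-Fa-f]{4})/g,
    (_, hex2, hex4) => String.fromCharCode(parseInt(hex2 || hex4, 16))
  );
}

const literal = toCommentSafeLiteral("Bob's--string");
console.log('<!-- ' + literal + ' -->'); // <!-- 'Bob\x27s\x2D\x2Dstring' -->
console.log(fromCommentLiteral(' ' + literal + ' ')); // Bob's--string
```

Since every backslash in the output begins an escape sequence, the decoder can expand them in one left-to-right pass without ambiguity.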



Tags: html escaping