escaping html is fine - it will remove <
's and >
's etc.
ive run into a problem where i am outputting a filename inside a comment tag eg. <!-- ${filename} -->
of course things can be bad if you dont escape, so it becomes:
<!-- <c:out value="${filename}"/> -->
the problem is that if the file has "--" in the name, all the html gets screwed, since youre not allowed to have <!-- -- -->
.
the standard html escape doesnt escape these dashes, and i was wondering if anyone is familiar with a simple / standard way to escape them.
Definition of a HTML comment:
A comment declaration starts with <!, followed by zero or more comments, followed by >. A comment starts and ends with "--", and does not contain any occurrence of "--".
Of course the parsing of a comment is up to the browser.
Nothing strikes me as an obvious solution here, so I'd suggest you str_replace those double dashes out.
There is no good way to solve this. You can't just escape them because comments are read in plaintext. You will have to do something like put a space between the hyphens, or use some sort of code for hyphens (like [HYPHEN]
).
Since it is obvoius that you cannnot directly display the '--'s you can either encode them or use the fn:escapeXml or fn:replace tags for appropriate replacements.
JSTL documentation
There's no universal working way to escape those characters in html unless the - characters are in multiples of four so if you do -- it wont work in firefox but ---- will work. So it all depends on the browser. For Example, looking at Internet Explorer 8, it is not a problem, those characters are escaped properly. The same goes for Googles Chrome... However Firefox even the latest browser (3.0.4), it doesn't handle escaping of these characters well.
You shouldn't be trying to HTML-escape, the contents of comments are not escapable and it's fine to have a bare ‘>’ or ‘&’ inside.
‘--’ is its own, unrelated problem and is not really fixable. If you don't need to recover the exact string, just do a replacement to get rid of them (eg. replace with ‘__’).
If you do need to get a string through completely unmolested to a JavaScript that will be reading the contents of the comment, use a string literal:
<!-- 'my-string' -->
which the script can then read using eval(commentnode.data). (Yes, a valid use for eval() at last!)
Then your escaping problem becomes how to put things in JS string literals, which is fairly easily solvable by escaping the ‘'’ and ‘-’ characters:
<!-- 'Bob\x27s\x2D\x2Dstring' -->
(You should probably also escape ‘<’, ‘&’ and ‘"’, in case you ever want to use the same escaping scheme to put a JS string literal inside a <script> block or inline handler.)