I have spent the last two days working with WMD and Markdown, and I still haven't found a way to store the data securely. I would like users to be able to post HTML/XML code (with WMD) on my site.
For the moment, I store the data in Markdown format, but a user who disables JavaScript can easily submit XSS. If I run strip_tags or htmlentities on all the data, I lose the HTML/XML code the user posted. How can I handle this?
In my opinion, I should apply htmlentities only to the code between <pre> and </pre>, but how? My data is in Markdown.
Also, how can I block XSS in attributes, such as:
<img src="javascript:alert('xss');" />
To "clean" your HTML, you could use a tool like HTML Purifier.
Basically, it allows you to specify which tags and attributes are allowed, and only keeps those.
It also produces valid (X)HTML code as output, which is nice.
You can see on the demo page there is an example that is almost exactly the XSS you posted, btw ;-)
For instance, you can try with some HTML like this one :
The output is :
The img tag with XSS has not been kept; the other one has, and an alt attribute has been added to make it standards-compliant.
It might not solve all your problems, but if you are giving users the possibility to input HTML, it is definitely useful (dare I say it's a must-have?)
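As a rough illustration of the allowlist approach, here is a minimal PHP sketch. It assumes HTML Purifier is installed and that its HTMLPurifier.auto.php loader is on the include path. The helper name purify_user_html, the list of allowed tags, and the example URL are hypothetical choices for this sketch, not something from the question. It also assumes the Markdown has already been converted to HTML on the server, so the purifier runs on the generated HTML rather than on the raw Markdown.

```php
<?php
// Assumption: HTML Purifier is installed and this loader is on the include path.
require_once 'HTMLPurifier.auto.php';

/**
 * Hypothetical helper: keep only an allowlist of tags and attributes.
 */
function purify_user_html(string $dirtyHtml): string
{
    $config = HTMLPurifier_Config::createDefault();

    // Illustrative allowlist; adjust it to what your site actually needs.
    $config->set(
        'HTML.Allowed',
        'p,br,strong,em,ul,ol,li,blockquote,pre,code,a[href],img[src|alt]'
    );

    $purifier = new HTMLPurifier($config);

    // Anything outside the allowlist, or with a disallowed URI scheme
    // such as javascript:, is dropped from the result.
    return $purifier->purify($dirtyHtml);
}

// Usage: one image with a javascript: URI, one with a regular URL.
$dirty = '<img src="javascript:alert(\'xss\');" />'
       . '<img src="http://www.example.com/logo.png" />';

echo purify_user_html($dirty);
```

Running the purifier when the content is displayed (or once at save time, keeping the original Markdown for editing) is a design choice; either way, the idea is that only the purified HTML is ever sent to other users' browsers.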