I have added a simple WYSIWYG editor to my website. (It only allows B / I / U, nothing more.)
I currently store all content as HTML in my database, but at the moment it's easy to add <a onclick='...'> or other malicious code.
What's the best way in PHP to parse the input safely?
How can I implement <b></b>, <i></i> and so on as a whitelist and encode everything else?
HTMLPurifier
I'm just going to throw this one out there and probably get the inevitable lashing. I would not use strip_tags to secure your WYSIWYG form... ever (unless you want to piss off your users).
It won't secure your form, and you may be killing your users' experience.
Chris Shiflett wrote an excellent paragraph about this in a blog post:
I detest commenting on blogs where my comment is passed through something like strip_tags(), effectively mangling what I'm trying to say. It reminds me of using an IM client that tries to identify smilies and replace them with images, often making responses difficult to decipher.
Another Reason
Someone else also wrote this in another answer, which I like:
$str = "10 appels is <than 12 apples";
var_dump(strip_tags($str));
The output I get is:
string '10 appels is ' (length=13)
I personally would not use anything other than HTMLPurifier
Try a demo here: http://htmlpurifier.org/demo.php
And look at this similar question
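For the B / I / U case in the question, an HTMLPurifier setup could look roughly like the sketch below. It assumes the library was installed with Composer (package ezyang/htmlpurifier) and that the autoloader lives at vendor/autoload.php; adjust the require line if you load it differently. The $dirty string is just made-up test input.

```php
<?php
// Assumption: HTML Purifier installed via Composer; the path is hypothetical.
require_once __DIR__ . '/vendor/autoload.php';

// Start from the default configuration and whitelist only <b>, <i> and <u>.
// No attributes are listed, so onclick, style and friends are not allowed.
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Allowed', 'b,i,u');

$purifier = new HTMLPurifier($config);

// Made-up input: an event handler, a disallowed tag and a stray "<".
$dirty = '<b onclick="alert(1)">bold</b> <a href="#" onclick="evil()">link</a> 10 apples is <than 12 apples';

// purify() returns the cleaned HTML; store or output this instead of $dirty.
$clean = $purifier->purify($dirty);
echo $clean;
```

Purifying once before saving to the database (or once before output, but consistently) keeps the stored HTML limited to the whitelist.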
Use strip_tags(): http://php.net/manual/en/function.strip-tags.php
string strip_tags ( string $str [, string $allowable_tags ] )
The second parameter is a list of allowable tags; just list '<b><i><u>' and the rest will be stripped.
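A quick sketch of that call, with a made-up input string:

```php
<?php
// Made-up input containing one tag that is not on the whitelist.
$input = '<b>bold</b> <a href="http://example.com">link</a> <i>italic</i>';

// Keep only <b>, <i> and <u>; every other tag is removed, its text stays.
$clean = strip_tags($input, '<b><i><u>');

echo $clean; // <b>bold</b> link <i>italic</i>
```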
Do note that as deceze mentioned:
This function does not modify any attributes on the tags that you
allow using allowable_tags, including the style and onmouseover
attributes that a mischievous user may abuse when posting text that
will be shown to other users.
So it doesn't offer full protection from malicious code by itself!
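To make that caveat concrete, and to sketch one way of doing the "whitelist and encode everything else" part with only built-in functions: escape the whole input first, then turn back only the bare, attribute-free b / i / u tags. This is a rough sketch, not a full sanitizer; sanitize_biu is a made-up helper name, and it does not check that tags are balanced or properly nested.

```php
<?php
// The attribute survives strip_tags() because <b> itself is allowed.
$risky = strip_tags('<b onmouseover="alert(1)">hi</b>', '<b><i><u>');
echo $risky, "\n"; // <b onmouseover="alert(1)">hi</b>

// Hypothetical helper: encode everything, then restore bare <b>, <i>, <u>.
function sanitize_biu(string $html): string
{
    // Step 1: encode all markup so nothing is interpreted as HTML.
    $escaped = htmlspecialchars($html, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    // Step 2: restore only opening/closing b, i, u tags without attributes.
    // A tag carrying attributes does not match and therefore stays encoded.
    return preg_replace('#&lt;(/?)(b|i|u)&gt;#i', '<$1$2>', $escaped);
}

echo sanitize_biu('<b>bold</b> and <a onclick="evil()">link</a>'), "\n";
```

Because an opening tag with attributes stays encoded while its plain closing tag is restored, the result can contain unbalanced tags; if that matters, a real parser-based library is the safer route.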