Researched links:
"How do you apply htmlentities selectively?" and "PHP function to strip tags, except a list of whitelisted tags and attributes".
They come close, but neither does quite what I need.
What have I tried?
<?php
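define('CHARSET', 'UTF-8');
define('REPLACE_FLAGS', ENT_QUOTES | ENT_HTML5);
function htmlcleaned($string) {
    // Escape everything first; htmlspecialchars() only touches & < > " and '.
    $string = htmlspecialchars($string, REPLACE_FLAGS, CHARSET);
    // Then turn the whitelisted tags (bare, without attributes) back into markup.
    return str_replace(
        array("&lt;i&gt;", "&lt;b&gt;", "&lt;/i&gt;", "&lt;/b&gt;", "&lt;p&gt;", "&lt;/p&gt;"),
        array("<i>", "<b>", "</i>", "</b>", "<p>", "</p>"), $string);
}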
echo htmlcleaned("<p>How are you?</p><p><b>This is bold</b></p><p><i>This is italic</i></p><p><u>This is underline</u></p><p><br></p><ul><li>This is list item 1</li><li>This is list item 2</li></ul><p><br></p><ol><li>This is ordered list item 1</li><li>This is ordered list item 2</li></ol><p><a target='_blank' style='color: #1c5c76;' href='http://www.google.com'>http://www.google.com</a></p><p>This is plain text again.<br></p><script>alert('attempt csrf');</script><p><p>This is P tag example</p></p>");
?>
What I want to achieve?
If the input is:
<b><script>alert("something");</script></b>
then the output should be:
<b>&lt;script&gt;alert("something");&lt;/script&gt;</b>
There is no specific blacklist, but there is a specific whitelist.
You can use PHP's DOM classes to achieve this: first create an element (in your case, <b>), then set the encoded string as its body (inner HTML).
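The snippet that accompanied this answer is not preserved here, so the following is only a minimal sketch of the idea, not the answerer's code. It assumes the `dom` extension is available, and the helper name `wrapInTag` is hypothetical. Instead of encoding the string by hand, it puts the untrusted input into a text node and lets the serializer escape it.

```php
<?php
// Hypothetical helper (not from the original answer): wraps untrusted
// text in a single whitelisted element using PHP's DOM extension.
function wrapInTag(string $tagName, string $untrusted): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $element = $doc->createElement($tagName);
    // A text node is escaped when the element is serialized,
    // so any markup inside $untrusted stays inert.
    $element->appendChild($doc->createTextNode($untrusted));
    $doc->appendChild($element);
    // Serialize only this element, not a full HTML document.
    return $doc->saveHTML($element);
}

echo wrapInTag('b', '<script>alert("something");</script>');
```

This only covers the case where the allowed tag is known up front and wraps the whole input; it does not pick allowed tags out of a mixed HTML string.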
You can use a built-in function instead of writing your own, like this:
This function might help, although it has not been tested thoroughly. It applies htmlentities to every tag except the ones you specify.
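The function itself is missing from this copy of the answer. As a rough stand-in, here is one way such a helper could look; the name `escapeExceptAllowed` and its parameters are assumptions, and it uses `htmlspecialchars` rather than `htmlentities` so that `/` and other punctuation are left alone. Only bare tags are restored, so an allowed tag that carries attributes stays escaped.

```php
<?php
// Hypothetical helper, not the answer's original code.
function escapeExceptAllowed(string $html, array $allowedTags): string
{
    // Escape everything, quotes included, so no markup survives by default.
    $escaped = htmlspecialchars($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    if ($allowedTags === []) {
        return $escaped;
    }
    // Quote each tag name so it is matched literally inside the pattern.
    $names = implode('|', array_map(
        fn(string $tag): string => preg_quote($tag, '~'),
        $allowedTags
    ));
    // Restore only bare opening/closing tags such as <b> or </b>.
    return preg_replace('~&lt;(/?)(' . $names . ')&gt;~i', '<$1$2>', $escaped);
}

echo escapeExceptAllowed('<b><script>alert("something");</script></b>', ['b', 'i', 'p']);
// prints: <b>&lt;script&gt;alert(&quot;something&quot;);&lt;/script&gt;</b>
```

Because the whole input is escaped before anything is restored, a tag missing from the list can never slip through unescaped; the trade-off is that attributes on allowed tags (for example `href` on `a`) are not supported by this sketch.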