User input filtering - do I need to filter HTML?

Posted 2019-03-21 01:35

Question:

Note: I take care of SQL injection and output escaping elsewhere - this question is about input filtering only, thanks.

I'm in the middle of refactoring my user input filtering functions. Before passing the GET/POST parameter to a type-specific filter with filter_var() I do the following:

  • check the parameter encoding with mb_detect_encoding()
  • convert to UTF-8 with iconv() (with //IGNORE) if it's not ASCII or UTF-8
  • clean up whitespace with a function found on GnuCitizen.org
  • pass the result through strip_tags() - no tags allowed at all, Markdown only
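
A minimal sketch of the steps listed above, under stated assumptions: prefilter() and clean_whitespace() are hypothetical names, clean_whitespace() is only a stand-in for the GnuCitizen function (which is not reproduced here), the candidate encoding list and the ISO-8859-1 fallback are guesses, and the mbstring and iconv extensions are assumed to be loaded.

```php
<?php
// Hypothetical stand-in for the GnuCitizen whitespace cleaner:
// collapses runs of whitespace to one space and trims the ends.
function clean_whitespace(string $s): string
{
    return trim(preg_replace('/\s+/u', ' ', $s) ?? '');
}

// Hypothetical pre-filter mirroring the four steps in the list.
function prefilter(string $raw): string
{
    // 1. Detect the encoding (strict mode; the candidate list is an assumption).
    $enc = mb_detect_encoding($raw, ['ASCII', 'UTF-8', 'ISO-8859-1'], true);

    // 2. Convert to UTF-8 unless the input is already ASCII or UTF-8.
    if ($enc !== 'ASCII' && $enc !== 'UTF-8') {
        $converted = iconv($enc ?: 'ISO-8859-1', 'UTF-8//IGNORE', $raw);
        $raw = ($converted === false) ? '' : $converted;
    }

    // 3. Clean whitespace, then 4. strip every tag.
    return strip_tags(clean_whitespace($raw));
}
```

The result would then go to the type-specific filter_var() call. Note that it is still not safe to print into HTML without output escaping.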

Now the question: does it still make sense to pass the parameter to a filter like htmLawed or HTML Purifier, or can I treat the input as safe? It seems to me that these two differ mostly in the granularity of allowed HTML elements and attributes (which I'm not interested in, as I remove everything), but the htmLawed docs have a section about 'dangerous characters' that suggests there might be a reason to use it. In that case, what would be a sane configuration for it?

Answer 1:

There are many different secure approaches to preventing XSS. The only way to know whether your approach holds water is to test it through exploitation. I recommend using a Free XSS vulnerability Scanner*, or the open source wapiti.

To be honest, I never use strip_tags(), because you don't always need HTML tags to execute JavaScript! I prefer htmlspecialchars($var, ENT_QUOTES);.

For instance, this is vulnerable to XSS:

print('<A HREF="http://www.xssed.com/'.strip_tags($_REQUEST['xss']).'">link</a>');

You don't need <> to execute JavaScript in this case, because the input lands inside an attribute value and can close it to add an onmouseover handler. Here is an example attack:

$_REQUEST['xss']='" onMouseOver="alert(/xss/)"';

With ENT_QUOTES, htmlspecialchars() encodes both double and single quotes, so the payload can no longer break out of the attribute value, which patches this XSS vulnerability.
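
To make the difference concrete, here is a small sketch that runs the payload above through both functions. The variable names are illustrative only, and passing 'UTF-8' as the charset is an assumption about the page encoding.

```php
<?php
// The attack string from the example above: no < or > anywhere in it.
$payload = '" onMouseOver="alert(/xss/)"';

// strip_tags() finds no tags to remove, so the quotes pass through
// and the attacker's handler becomes a real attribute of the link.
$unsafe = '<A HREF="http://www.xssed.com/' . strip_tags($payload) . '">link</a>';

// htmlspecialchars() with ENT_QUOTES turns " into &quot; (and ' into &#039;),
// so the payload stays inside the HREF value as inert text.
$safe = '<A HREF="http://www.xssed.com/'
      . htmlspecialchars($payload, ENT_QUOTES, 'UTF-8')
      . '">link</a>';

print($unsafe . "\n");
print($safe . "\n");
```

The first line printed still contains a live onMouseOver attribute; in the second, every quote from the payload has been replaced by &quot;.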

*I am affiliated with this site/service.



Answer 2:

I think what you're doing is safe. At least from my point of view, no HTML code should get through your filter.