If I'm sanitizing my DB inserts, and also escaping the HTML I write with htmlentities($text, ENT_COMPAT, 'UTF-8'), is there any point in also filtering the inputs with xss_clean? What other benefits does it give?
In your case, "stricter methods are fine, and lighter weight". CodeIgniter developers intend xss_clean() for a different use case, "a commenting system or forum that allows 'safe' HTML tags". This isn't clear from the documentation, where xss_clean is shown applied to a username field.
There's another reason never to use xss_clean() that hasn't been highlighted on Stack Overflow so far. xss_clean() was broken during 2011 and 2012, and it's impossible to fix completely, at least without a complete redesign, which didn't happen. At the moment, it's still vulnerable to strings like this:
The current implementation of xss_clean() starts by effectively applying urldecode() and html_entity_decode() to the entire string. This is needed so it can use a naive check for things like "javascript:". At the end, it returns the decoded string.
An attacker can simply encode their exploit twice. It will be decoded once by xss_clean(), and pass as clean. You then have a singly-encoded exploit, ready for execution in the browser.
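To make the double-encoding problem concrete, here is a minimal sketch. The naive_clean() function is a hypothetical stand-in for any decode-then-blacklist filter; it is not CodeIgniter's actual code, and the payloads are illustrative only.

```php
<?php
// Hypothetical stand-in for a decode-then-blacklist filter.
// This is NOT the real xss_clean(); it only mimics the pattern described above.
function naive_clean(string $input): string
{
    // Decode once so the blacklist check can see the plain text.
    $decoded = html_entity_decode(rawurldecode($input), ENT_QUOTES, 'UTF-8');

    // Naive blacklist check on the decoded text.
    if (stripos($decoded, 'javascript:') !== false) {
        return '[removed]';
    }

    // The decoded string is what gets returned to the caller.
    return $decoded;
}

// Encoded once: the filter decodes it, sees "javascript:" and blocks it.
$once = naive_clean('javascript&#58;alert(1)');

// Encoded twice: one round of decoding leaves "&#58;", so the check passes.
$twice = naive_clean('javascript&amp;#58;alert(1)');

echo $once, "\n";   // [removed]
echo $twice, "\n";  // javascript&#58;alert(1)
```

The second value still contains &#58;, which a browser decodes to a colon when it parses an attribute such as href. The filter has removed one layer of encoding and handed back a working payload.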
I call these checks "naive" and unfixable because they're largely reliant on regular expressions. HTML is not a regular language. You need a more powerful parser to match the one in the browser; xss_clean() doesn't have anything like that. Maybe it's possible to whitelist a subset of HTML, which lexes cleanly with regular expressions. However, the current xss_clean() is very much a blacklist.
Yes, you should still be using it. I generally make it a rule to use it at least on public-facing input, meaning any input that anyone can access and submit to.
Generally sanitizing the input for DB queries seems like a side-effect as the true purpose of the function is to prevent Cross-site Scripting Attacks.
I'm not going to get into the nitty-gritty details of every step xss_clean takes, but I will tell you it does more than the few steps you mentioned. I've pastied the source of the xss_clean function so you can look for yourself; it is fully commented.
I would recommend using http://htmlpurifier.org/ for doing XSS purification. I'm working on extending my CodeIgniter Input class to start leveraging it.
If you want the filter to run automatically every time it encounters POST or COOKIE data you can enable it by opening your application/config/config.php file and setting this: $config['global_xss_filtering'] = TRUE;
You can enable csrf protection by opening your application/config/config.php file and setting this: $config['csrf_protection'] = TRUE;
For more details, see the following link:
https://ellislab.com/codeigniter/user-guide/libraries/security.html
xss_clean() is extensive, and also silly. 90% of this function does nothing to prevent XSS, such as looking for the word alert but not document.cookie. No hacker is going to use alert in their exploit; they are going to hijack the cookie with XSS or read a CSRF token to make an XHR.

However, running htmlentities() or htmlspecialchars() with it is redundant. A case where xss_clean() fixes the issue and htmlentities($text, ENT_COMPAT, 'UTF-8') fails is the following:

A simple PoC is:
This will add the onload= event handler to the image tag. One way of stopping this form of XSS is htmlspecialchars($var, ENT_QUOTES); in this case xss_clean() will also prevent it.

However, quoting from the xss_clean() documentation:
That being said, XSS is an output problem, not an input problem. For instance, this function cannot take into account that the variable is already within a <script> tag or event handler. It also doesn't stop DOM-based XSS. You need to take into consideration how you are using the data in order to use the best function. Filtering all data on input is a bad practice. Not only is it insecure, but it also corrupts data, which can make comparisons difficult.
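As a rough illustration of escaping for the output context, here is a small sketch. The input value, the image path and the variable names are all made up for this example (it is not the PoC from the answer above), and the json_encode() approach for script blocks is one common option rather than something the answer itself prescribes.

```php
<?php
// Hypothetical attacker-controlled value, for illustration only.
$input = "x' onload='alert(1)";

// ENT_COMPAT escapes double quotes only, so the single quotes survive.
$compat = htmlentities($input, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES escapes both single and double quotes.
$quoted = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// In a single-quoted attribute, the ENT_COMPAT value closes the attribute
// early and smuggles in an onload handler; the ENT_QUOTES value does not.
$brokenTag = "<img src='a.png' alt='" . $compat . "'>";
$safeTag   = "<img src='a.png' alt='" . $quoted . "'>";

// Inside a <script> block, HTML escaping is the wrong tool. Encoding the
// value as a JSON string with the HEX flags keeps <, >, &, ' and " out of it.
$jsLiteral = json_encode(
    "</script><script>alert(1)</script>",
    JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
);
$script = "<script>var name = " . $jsLiteral . ";</script>";

echo $brokenTag, "\n";
echo $safeTag, "\n";
echo $script, "\n";
```

The point is that the same value needs different treatment depending on where it lands, which a single input filter cannot know in advance.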