securely strip html tags in javascript with whitel

Posted 2019-04-17 08:27

Question:

I want to strip almost every HTML tag from a string in JavaScript, allowing only a few basic tags
(and stripping their attributes), to prevent cross-site scripting (XSS).

A lot of people say it shouldn't be done in JavaScript, because clients might have JavaScript disabled, which would break the filter. However, my whole project depends on JavaScript, so no client with JavaScript disabled will ever see the output. Besides, I am unable to do it server-side.

(1) Am I right to assume that in this case it can be done securely?

bobince recommends using the DOM (instead of regular expressions) to filter potentially insecure input. I am certainly no XSS expert, but because his example depends on the string being inserted into the DOM before the filter does its job, I could imagine it being insecure because of something like:

var unsecureString = '<img src=".." onload="alert(\'bad\')" />';
$('#alice').update(unsecureString);
filterNodes($('#alice'), {p:[],a:['href']}); // see link above

(2) Can I be certain that the bad event above won't ever fire?

(3) If not, how can I avoid such problems but still use the DOM?

Answer 1:

Have a look at the Google Caja sanitizer:

https://code.google.com/p/google-caja/wiki/JsHtmlSanitizer
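
Regarding (3), one way to keep using the DOM without ever inserting the untrusted string into the live page is to parse it into a separate document and build the output from scratch. The following is only a rough sketch under some assumptions, not Caja's API and not bobince's code: `WHITELIST`, `SAFE_URL`, `escapeHtml`, `serializeFiltered` and `sanitize` are names made up for this example, and restricting `href` to `http(s):` and `mailto:` is my own addition (otherwise a `javascript:` link would pass the whitelist). As far as I know, a document created by `DOMParser` has scripting disabled and does not fetch images, so the `onload` handler from the question should not get a chance to fire.

```javascript
// Hypothetical whitelist, same shape as in the question:
// tag name -> list of attribute names that may be kept.
const WHITELIST = { p: [], a: ['href'] };

// Assumption: only these URL schemes are acceptable in href.
const SAFE_URL = /^(?:https?:|mailto:)/i;

// Escape text so it can be placed in element content or a quoted attribute.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Walk a DOM node and return a newly built HTML string that contains
// only whitelisted tags and attributes. Nothing from the input markup
// is copied verbatim; text is re-escaped and tags are re-emitted.
// Note: void elements such as <br> would need special handling.
function serializeFiltered(node, whitelist) {
  if (node.nodeType === 3) {
    // Text node: keep the text, escaped.
    return escapeHtml(node.nodeValue);
  }
  if (node.nodeType !== 1) {
    // Comments, processing instructions, etc. are dropped.
    return '';
  }

  let inner = '';
  for (const child of Array.from(node.childNodes)) {
    inner += serializeFiltered(child, whitelist);
  }

  const tag = node.nodeName.toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(whitelist, tag)) {
    // Tag not allowed: strip the tag itself, keep its (filtered) content.
    return inner;
  }

  let attrs = '';
  for (const name of whitelist[tag]) {
    const value = node.getAttribute(name);
    if (value === null) continue;
    if (name === 'href' && !SAFE_URL.test(value)) continue;
    attrs += ' ' + name + '="' + escapeHtml(value) + '"';
  }
  return '<' + tag + attrs + '>' + inner + '</' + tag + '>';
}

// Browser-only entry point: parse into a separate document, then filter.
// <body> is not whitelisted, so only its filtered content is returned.
function sanitize(html, whitelist = WHITELIST) {
  const doc = new DOMParser().parseFromString(html, 'text/html');
  return serializeFiltered(doc.body, whitelist);
}
```

In a browser you would then do something like `element.innerHTML = sanitize(unsecureString)`. For real projects a maintained sanitizer (such as the one linked above) is probably the safer choice, since hand-written filters tend to miss edge cases.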