Server side HTML sanitizer/cleanup for JSF

Posted 2019-01-12 12:23

Question:

Is there an HTML sanitizer or cleanup method available in any JSF utility kit or library such as PrimeFaces or OmniFaces?

I need to sanitize HTML entered by users via p:editor and display it as safe HTML output using escape="false", following the stackexchange style. Before displaying the HTML, I'm thinking of storing the sanitized input in the database, so that it is safe to use with escape="false" and XSS is not a danger.

Answer 1:

To achieve that, you basically need a standalone HTML parser. HTML parsing is rather complex, and that responsibility is beyond the scope of JSF, PrimeFaces and OmniFaces. You're expected to pick one of the many existing HTML parsing libraries.

One example is Jsoup, which even has a dedicated method for sanitizing HTML against a Whitelist (renamed Safelist in newer jsoup releases): Jsoup#clean(). For example, if you want to allow some basic HTML without images, use Whitelist.basic():

String sanitizedHtml = Jsoup.clean(rawHtml, Whitelist.basic());

A completely different alternative is to use a dedicated text formatting syntax, such as Markdown (which is also used here). Many of those parsers can also sanitize or suppress embedded HTML under the covers, though it is worth checking this for the one you pick. An example is Pegdown. Perhaps this is what you actually meant when you said "stackexchange style".

As for saving to the DB, it's best to store both the raw and the parsed forms in two separate text columns. The raw form should be redisplayed during editing. The parsed form should be updated in the background whenever the raw form has been edited. During display, show only the parsed form, with escape="false".
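The two-column idea can be sketched roughly as follows. This is a minimal illustration, not part of JSF, PrimeFaces, OmniFaces or Jsoup: the StoredPost class and its field names are hypothetical, the sanitizer is passed in as a plain function so the sketch has no library dependency, and the parsed form is refreshed synchronously rather than in the background to keep it short.

```java
import java.util.Objects;
import java.util.function.UnaryOperator;

/**
 * Hypothetical model holding both forms of a user-submitted post.
 * Class and field names are illustrative only.
 */
final class StoredPost {
    private final UnaryOperator<String> sanitizer;
    private String rawHtml = "";        // exactly what the user typed; redisplayed in the editor
    private String sanitizedHtml = "";  // derived form; the only one rendered with escape="false"

    StoredPost(UnaryOperator<String> sanitizer) {
        this.sanitizer = Objects.requireNonNull(sanitizer, "sanitizer");
    }

    /** Stores the new raw form and re-derives the sanitized form from it. */
    void setRawHtml(String newRawHtml) {
        String raw = Objects.requireNonNull(newRawHtml, "newRawHtml");
        // Sanitize first, so that a failing sanitizer leaves both fields unchanged.
        String sanitized = sanitizer.apply(raw);
        this.rawHtml = raw;
        this.sanitizedHtml = sanitized;
    }

    String getRawHtml() {
        return rawHtml;
    }

    String getSanitizedHtml() {
        return sanitizedHtml;
    }
}
```

With Jsoup on the classpath, the sanitizer argument could be something like html -> Jsoup.clean(html, Whitelist.basic()). Keeping the sanitized field writable only through setRawHtml() means the two columns cannot drift apart when the post is edited.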

See also:

  • Markdown or HTML