Fixing malformed HTML in PHP?

Posted 2019-06-18 22:29

I am constructing a large HTML document from user-supplied fragments, which have the annoying habit of being malformed in various ways. Browsers are robust and forgiving enough to render them, but I want to be able to validate and (ideally) fix any malformed HTML where possible. For example:

<td><b>Title</td>

can be reasonably fixed to:

<td><b>Title</b></td>

Is there an easy way to do this in PHP?

Tags: php html parsing
3 answers
一纸荒年 Trace。
#2 · 2019-06-18 22:32

If you can't use Tidy (some hosting services do not enable this PHP extension), you can use this PHP class: http://www.barattalo.it/html-fixer/
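
Another fallback when Tidy is unavailable, shown here only as a rough sketch: PHP's bundled DOM extension can parse a fragment leniently and serialize it back with balanced tags. This is not the class linked above. The function name `repairWithDom` and the wrapper `div` are made up for illustration, character encoding handling is left out, and the parser may rearrange markup it considers out of context.

```php
<?php
// Hypothetical helper: re-serialize a fragment through DOMDocument so that
// every opened element is closed in the output.
function repairWithDom(string $fragment): string
{
    // Collect parser warnings internally instead of emitting PHP warnings
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Wrap the fragment so its repaired children can be located afterwards
    $loaded = $doc->loadHTML('<div>' . $fragment . '</div>');

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The wrapper is the first div in document order
    $root = $loaded ? $doc->getElementsByTagName('div')->item(0) : null;
    if ($root === null) {
        return $fragment; // parsing failed; hand back the input unchanged
    }

    // Serialize only the children of the wrapper, not the wrapper itself
    $html = '';
    foreach ($root->childNodes as $child) {
        $html .= $doc->saveHTML($child);
    }
    return $html;
}

echo repairWithDom('<p><b>Title</p>'), "\n";
```

Because the output is a serialization of a parsed tree, open and close tags always pair up, but the result is whatever the parser decided the fragment meant, so it is worth checking against real user input.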

闹够了就滚
#3 · 2019-06-18 22:47

You can use HTML Tidy; see its man pages for details.
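
For illustration, a minimal sketch of calling Tidy from PHP through the tidy extension, assuming that extension is enabled. The function name `repairWithTidy` is hypothetical, and the configuration options are just one reasonable choice for repairing fragments rather than whole documents.

```php
<?php
// Hypothetical helper: repair an HTML fragment with the tidy extension.
// Returns null when the extension is not loaded or the repair fails.
function repairWithTidy(string $fragment): ?string
{
    if (!extension_loaded('tidy')) {
        return null;
    }

    $config = [
        'show-body-only' => true, // return only the body content, no <html>/<head> wrapper
        'output-html'    => true, // emit HTML rather than XHTML
        'wrap'           => 0,    // do not hard-wrap long lines
    ];

    $repaired = tidy_repair_string($fragment, $config, 'utf8');
    return $repaired === false ? null : $repaired;
}

$result = repairWithTidy('<p><b>Title</p>');
echo $result ?? 'tidy extension not available', "\n";
```

Tidy may restructure markup it considers out of context (for example, table cells without an enclosing table), so the repaired fragment is not guaranteed to match what a browser would render.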

祖国的老花朵
#4 · 2019-06-18 22:53

I highly recommend HTML Purifier. From their site:

HTML Purifier is a standards-compliant HTML filter library written in PHP. HTML Purifier will not only remove all malicious code (better known as XSS) with a thoroughly audited, secure yet permissive whitelist, it will also make sure your documents are standards compliant, something only achievable with a comprehensive knowledge of W3C's specifications. Tired of using BBCode due to the current landscape of deficient or insecure HTML filters? Have a WYSIWYG editor but never been able to use it? Looking for high-quality, standards-compliant, open-source components for that application you're building? HTML Purifier is for you!
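
A minimal usage sketch, assuming HTML Purifier was installed with Composer (package `ezyang/htmlpurifier`) so that `vendor/autoload.php` sits next to the script. The helper name `purifyFragment` is made up for illustration, and the default configuration is used apart from disabling the definition cache.

```php
<?php
// Assumption: the library is available through Composer's autoloader.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require_once $autoload;
}

// Hypothetical helper: clean and repair a fragment with HTML Purifier.
// Returns null when the library is not installed.
function purifyFragment(string $dirty): ?string
{
    if (!class_exists('HTMLPurifier')) {
        return null;
    }

    $config = HTMLPurifier_Config::createDefault();
    // Disable the definition cache so no writable cache directory is required
    $config->set('Cache.DefinitionImpl', null);

    $purifier = new HTMLPurifier($config);
    return $purifier->purify($dirty);
}

$clean = purifyFragment('<p><b>Title</p>');
echo $clean ?? 'HTML Purifier not installed', "\n";
```

Since HTML Purifier filters against a whitelist, it can also strip elements and attributes it does not allow, so it changes more than tag balancing; the configuration controls what survives.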
