I've been attempting to parse HTML5 code so I can set attributes and values within it, but it seems DOMDocument (PHP 5.3) doesn't support tags like <nav> and <section>.
Is there any way to parse this as HTML in PHP and manipulate the code?
Code to reproduce:
<?php
$dom = new DOMDocument();
$dom->loadHTML("<!DOCTYPE HTML>
<html><head><title>test</title></head>
<body>
<nav>
<ul>
<li>first
<li>second
</ul>
</nav>
<section>
...
</section>
</body>
</html>");
Error
Warning: DOMDocument::loadHTML(): Tag nav invalid in Entity, line: 4 in /home/wbkrnl/public_html/new-mvc/1.php on line 17
Warning: DOMDocument::loadHTML(): Tag section invalid in Entity, line: 10 in /home/wbkrnl/public_html/new-mvc/1.php on line 17
You could also do
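One way to read this suggestion, and it is only an assumption, is PHP's error-suppression operator @ placed in front of the loadHTML() call. A minimal sketch with a made-up $html string:

```php
<?php
// Hypothetical input; any markup containing HTML5 elements would do.
$html = '<!DOCTYPE html><html><body><nav><ul><li>first</li></ul></nav></body></html>';

$dom = new DOMDocument();
// "@" silences the warnings raised by this one call only.
// The parser still builds the tree, including the <nav> element.
@$dom->loadHTML($html);

// The element can then be manipulated as usual.
$nav = $dom->getElementsByTagName('nav')->item(0);
$nav->setAttribute('class', 'main-menu');
echo $dom->saveHTML($nav), "\n";
```

The drawback is that @ hides every warning from that call, not only the unknown-tag ones.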
This worked for me:
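Judging by the follow-up about swapping header for a div, the approach here appears to be replacing the HTML5 tags with div before parsing. A minimal sketch under that assumption; the $html string and the tag list are illustrative, not taken from the post:

```php
<?php
// Illustrative input; in practice this might come from file_get_contents().
$html = '<html><body><nav><ul><li>first</li></ul></nav><section>...</section></body></html>';

// Map each HTML5 tag that the parser rejects onto a plain div.
$search  = array('<nav>', '</nav>', '<section>', '</section>', '<header>', '</header>');
$replace = array('<div>', '</div>', '<div>', '</div>', '<div>', '</div>');
$html = str_replace($search, $replace, $html);

$dom = new DOMDocument();
$dom->loadHTML($html);   // the replaced tags no longer trigger "Tag ... invalid"
echo $dom->saveHTML();
```

Note that this only matches tags written without attributes.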
If you need the header tag, replace the header with a div tag and give it an id. For instance:
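A sketch of that idea, assuming the goal is to find the former header again after parsing. The id value header1 and the input markup are made-up examples:

```php
<?php
$html = '<html><body><header><h1>Title</h1></header></body></html>';

// Give the replacement div an id so it can still be told apart
// from ordinary divs once the document is parsed.
$search  = array('<header>', '</header>');
$replace = array('<div id="header1">', '</div>');
$html = str_replace($search, $replace, $html);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Look the element up again by its id.
$xpath  = new DOMXPath($dom);
$header = $xpath->query('//div[@id="header1"]')->item(0);
echo $header->getElementsByTagName('h1')->item(0)->nodeValue, "\n"; // prints "Title"
```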
It's not the best solution but depending on the situation it can be useful.
Good luck.
No, there is no way of specifying a particular doctype to use, or to modify the requirements of the existing one.
Your best workable solution is going to be to disable error reporting with libxml_use_internal_errors.
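A minimal sketch of that approach, assuming the markup is held in a hypothetical $html variable. libxml_use_internal_errors(true) makes libxml collect parser errors internally instead of raising PHP warnings, and libxml_clear_errors() empties that buffer afterwards:

```php
<?php
$html = '<!DOCTYPE html><html><body><nav><ul><li>first</li></ul></nav></body></html>';

// Collect libxml errors in an internal buffer instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Discard whatever was collected and restore the earlier setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

echo $dom->getElementsByTagName('nav')->length, "\n";
```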
You can filter the errors you get from the parser. As per other answers here, turn off error reporting to the screen, and then iterate through the errors and only show the ones you want:
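One way this filtering could look is sketched below. It assumes the unwanted entries are the unknown-tag errors and identifies them by their message text ("Tag ... invalid", as in the warnings quoted in the question). The $html input is made up for illustration:

```php
<?php
$html = '<!DOCTYPE html><html><body><section><p>text</p></section></body></html>';

libxml_use_internal_errors(true);   // keep errors off the screen
$dom = new DOMDocument();
$dom->loadHTML($html);

$shown = array();
foreach (libxml_get_errors() as $error) {
    // $error is a LibXMLError; skip the "Tag <name> invalid" complaints.
    if (preg_match('/^Tag \S+ invalid/', $error->message)) {
        continue;
    }
    $shown[] = sprintf('line %d: %s', $error->line, trim($error->message));
}
libxml_clear_errors();

// Report only what survived the filter.
foreach ($shown as $line) {
    echo $line, "\n";
}
```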
A print_r() of a single error shows a LibXMLError object with the properties level, code, column, message, file and line.
By matching on the message and/or the code, these can be filtered out quite easily.
HTML5 tags almost always use attributes such as id, class and so on. So the code for replacing will be:
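A hedged sketch of an attribute-aware replacement follows. It uses preg_replace() on the tag name alone, which is one possible way to do it rather than necessarily the method meant here; the tag list and the sample markup are assumptions:

```php
<?php
$html = '<html><body><nav id="main" class="top"><ul><li>first</li></ul></nav></body></html>';

// Match the tag name only, so "<nav id=...>" and "</nav>" are both rewritten
// and any attributes after the name are left untouched.
$tags = 'header|nav|section|article|aside|footer';
$html = preg_replace('#<(/?)(?:' . $tags . ')\b#i', '<${1}div', $html);

echo $html, "\n";
// <html><body><div id="main" class="top"><ul><li>first</li></ul></div></body></html>
```

The \b word boundary keeps <head> from being mistaken for <header>.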
There doesn't seem to be a way to kill warnings but not errors. PHP has constants that are supposed to do this, but they don't seem to work. Here is what SHOULD work, but doesn't (possibly a bug):
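Passing such a constant would look roughly like the sketch below. The second argument of loadHTML() (available since PHP 5.4) takes libxml option flags, and LIBXML_NOWARNING is the constant documented as disabling warning reports. Whether it has any effect may depend on the PHP and libxml versions in use. The input markup is a made-up example:

```php
<?php
$html = '<html><body><nav><h1>hi</h1></nav></body></html>';

$doc = new DOMDocument();
// Ask libxml not to report warnings. As described above, this may not
// actually silence the unknown-tag messages on every setup.
$doc->loadHTML($html, LIBXML_NOWARNING);

echo $doc->saveHTML();
```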
http://php.net/manual/en/libxml.constants.php