I'm adding a #b hash to each link via the DOMDocument class.
$dom = new DOMDocument();
$dom->loadHTML($output);
$a_tags = $dom->getElementsByTagName('a');
foreach ($a_tags as $a)
{
    $value = $a->getAttribute('href');
    $a->setAttribute('href', $value . '#b');
}
return $dom->saveHTML();
That works fine; however, the returned output includes a DOCTYPE declaration and <head> and <body> tags. Any idea why that happens or how I can prevent it?
The real problem is the way the DOM is loaded. Use this instead:
$html->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
Please upvote the original answer here.
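Applied to the code from the question, the flags could be used roughly like this. This is a minimal sketch: the function name addHashWithFlags is made up for illustration, and it assumes PHP >= 5.4 built against libxml >= 2.7.8, since both constants depend on that.

```php
<?php
// Hypothetical helper name, used only for this illustration.
function addHashWithFlags($output)
{
    $dom = new DOMDocument();
    // LIBXML_HTML_NOIMPLIED: do not add implied <html>/<body> elements.
    // LIBXML_HTML_NODEFDTD: do not add a default DOCTYPE.
    $dom->loadHTML($output, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    foreach ($dom->getElementsByTagName('a') as $a) {
        $a->setAttribute('href', $a->getAttribute('href') . '#b');
    }

    return $dom->saveHTML();
}

echo addHashWithFlags('<p>See <a href="/page">this page</a></p>');
```

One caveat: without the implied wrapper, libxml seems to treat the first element as the document root, so a fragment with several top-level elements may come back restructured. Wrapping the fragment in a single container element first is a common workaround.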
Adding $doc->saveHTML(false); will not work: it raises an error because the method expects a node, not a bool. The solution I used:
return preg_replace('/^<!DOCTYPE.+?>/', '', str_replace( array('<html>', '</html>', '<body>', '</body>'), array('', '', '', ''), $doc->saveHTML()));
I'm using PHP > 5.4.
I solved this problem by creating a new DOMDocument and copying the child nodes from the original to the new one.
So instead of using
I use:
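A minimal sketch of the copy approach described above, not the poster's exact code. The function name addHashViaCopy and the variable names are assumptions, and it relies on loadHTML() having wrapped the fragment in a <body> element.

```php
<?php
// Hypothetical helper name, used only for this illustration.
function addHashViaCopy($output)
{
    $dom = new DOMDocument();
    $dom->loadHTML($output);

    foreach ($dom->getElementsByTagName('a') as $a) {
        $a->setAttribute('href', $a->getAttribute('href') . '#b');
    }

    // loadHTML() wraps the fragment in <html><body>; take the body's children.
    $body = $dom->getElementsByTagName('body')->item(0);

    // A fresh document has no DOCTYPE and no <html>/<body> wrapper.
    $copy = new DOMDocument();
    foreach ($body->childNodes as $child) {
        // importNode(..., true) makes a deep copy owned by the new document.
        $copy->appendChild($copy->importNode($child, true));
    }

    return $copy->saveHTML();
}

echo addHashViaCopy('<p>See <a href="/page">this page</a></p>');
```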
That's what DOMDocument::saveHTML() generally does, yes: it generates a full HTML document, with the DOCTYPE declaration, the <head> tag, and so on. Two possible solutions:
- saveHTML() accepts one additional parameter that might help you.
- Use str_replace(), a regex, or whatever equivalent you can think of to remove the portions of HTML code you don't need.
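For the first option, the extra parameter is a node: saveHTML($node) serializes only that node, which is available from PHP 5.3.6 on. A rough sketch, assuming the fragment ends up under <body> after loadHTML(); the function name addHashViaNodeSave is made up for illustration.

```php
<?php
// Hypothetical helper name, used only for this illustration.
function addHashViaNodeSave($output)
{
    $dom = new DOMDocument();
    $dom->loadHTML($output);

    foreach ($dom->getElementsByTagName('a') as $a) {
        $a->setAttribute('href', $a->getAttribute('href') . '#b');
    }

    // Serialize each child of <body> on its own, so neither the DOCTYPE
    // nor the <html>/<body> wrapper ends up in the result.
    $body = $dom->getElementsByTagName('body')->item(0);
    $html = '';
    foreach ($body->childNodes as $child) {
        $html .= $dom->saveHTML($child);
    }

    return $html;
}

echo addHashViaNodeSave('<p>One</p><p>Two <a href="/page">link</a></p>');
```

Compared with string replacement, this avoids accidentally stripping a literal <body> or <html> that appears inside the content itself.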