PHP DomDocument output without <?xml version="1.0" encoding="UTF-8"?>

Posted 2020-02-05 11:37

Question:

Is there an option with DOMDocument to remove the first line of the output:

<?xml version="1.0" encoding="UTF-8"?>

The class instantiation automatically adds it to the output, but is it possible to get rid of it?

Answer 1:

If you want to output HTML, use the saveHTML() function. It avoids most XML-specific syntax and handles closed and unclosed HTML elements properly.

If you want to output XML, you can rely on the fact that DOMDocument is itself a DOMNode (the document node, which '/' selects in an XPath expression). You can therefore use the DOMNode API to iterate over its child nodes and call saveXML() on each one. The XML declaration is not a child node, so this skips it while still outputting all other XML content properly.

Example:

$xml = get_my_document_object();
foreach ($xml->childNodes as $node) {
   echo $xml->saveXML($node);
}
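
To make the loop above concrete, here is a minimal self-contained sketch. The sample XML string and the helper name saveWithoutDeclaration are made up for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper: serialize every top-level node of the document.
// The XML declaration is not a child node, so it is never emitted.
function saveWithoutDeclaration(DOMDocument $doc): string
{
    $out = '';
    foreach ($doc->childNodes as $node) {
        $out .= $doc->saveXML($node);
    }
    return $out;
}

// Made-up sample document with a comment in front of the root element.
$doc = new DOMDocument();
$doc->loadXML('<?xml version="1.0" encoding="UTF-8"?><!-- note --><root><item>1</item></root>');

$result = saveWithoutDeclaration($doc);
echo $result, "\n";
```

Because the loop visits every top-level node, the comment before the root element is kept as well as the root element itself.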


Answer 2:

I think using DOMDocument is a universal solution for valid XML files:

If you already have the XML loaded in a variable:

$t_xml = new DOMDocument();
$t_xml->loadXML($xml_as_string);
$xml_out = $t_xml->saveXML($t_xml->documentElement);

For an XML file on disk:

$t_xml = new DOMDocument();
$t_xml->load($file_path_to_xml);
$xml_out = $t_xml->saveXML($t_xml->documentElement);

This comment helped: http://www.php.net/manual/en/domdocument.savexml.php#88525
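
One thing to keep in mind with this approach: documentElement is only the root element, so anything that sits next to it at the top level (a comment or processing instruction before the root, for example) is left out along with the declaration. A small sketch, using a made-up sample string, shows the difference:

```php
<?php
// Made-up sample: a comment precedes the root element.
$xmlString = '<?xml version="1.0"?><!-- header comment --><root><item>1</item></root>';

$doc = new DOMDocument();
$doc->loadXML($xmlString);

// Whole document: declaration, comment and root element.
$full = $doc->saveXML();

// Root element only: no declaration, but also no header comment.
$rootOnly = $doc->saveXML($doc->documentElement);

echo $rootOnly, "\n";
```

If the nodes outside the root matter, iterating over the document's childNodes keeps them.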



Answer 3:

You can use output buffering to remove it. A bit of a hack but it works.

ob_start();

// dom stuff

$output = ob_get_contents();
ob_end_clean();

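// Anchor the pattern and limit it to one replacement; the unanchored /(.+?\n)/ would strip every line, not just the first.
$clean = preg_replace('/^<\?xml[^>]*\?>\s*/', '', $output, 1);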
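
As a rough end-to-end sketch of this buffering approach, assuming the "dom stuff" simply echoes the result of saveXML() (the sample document is made up for illustration):

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<root><item>1</item></root>'); // made-up sample

ob_start();
echo $dom->saveXML();     // stands in for the "dom stuff" that prints the document
$output = ob_get_clean(); // fetch the buffer contents and end buffering in one call

// Remove only a leading XML declaration plus the whitespace that follows it.
$clean = preg_replace('/^<\?xml[^>]*\?>\s*/', '', $output, 1);

echo $clean;
```

The pattern is anchored at the start of the string and limited to one replacement, so the rest of the output is left untouched.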


Answer 4:

For me, none of the answers above worked:

$dom = new \DOMDocument();
$dom->loadXXX('<?xml encoding="utf-8" ?>' . $content);  // loadXML or loadHTML
$dom->saveXML($dom->documentElement);

The above didn't work for me if I had partial HTML, e.g.

<p>Lorem</p>
<p>Ipsum</p>

In that case it removed everything after <p>Lorem</p>.

The only solution that worked for me was:

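// Iterate over a copy: removing nodes from the live childNodes list while looping can skip entries.
foreach (iterator_to_array($dom->childNodes) as $xx) {
    if ($xx instanceof \DOMProcessingInstruction) {
        $xx->parentNode->removeChild($xx);
    }
}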
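
Put together, a sketch of this approach for the partial HTML above might look like the following. It assumes loadHTML() is the loader in use, and it reads the fragment back from the body element that the HTML parser adds; that last step is an assumption for illustration, not something stated in the answer. How the leading `<?xml encoding="utf-8" ?>` hint is represented in the tree can vary with the libxml2 version, so treat this as a starting point.

```php
<?php
$content = "<p>Lorem</p>\n<p>Ipsum</p>"; // the partial HTML from the example

$dom = new DOMDocument();
// The prefix is the encoding hint used in the answer above.
$dom->loadHTML('<?xml encoding="utf-8" ?>' . $content);

// Remove top-level processing instructions, iterating over a copy of the
// live node list so that removals do not disturb the loop.
foreach (iterator_to_array($dom->childNodes) as $node) {
    if ($node instanceof DOMProcessingInstruction) {
        $node->parentNode->removeChild($node);
    }
}

// Assumption: the parser wrapped the fragment in <html><body>, so the
// original fragment is the serialized children of <body>.
$body = $dom->getElementsByTagName('body')->item(0);
$html = '';
foreach ($body->childNodes as $child) {
    $html .= $dom->saveHTML($child);
}

echo $html, "\n";
```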