XPath savehtml remove parent element

Posted 2019-09-05 06:32

Question:

I want to get the HTML inside the parent element. For example, I have this structure:

<div>
<div>text<b>more text</b>and <i>some more</i></div>
</div>

and I want to get text<b>more text</b>and <i>some more</i> as a result.

Here's my code:

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$text = $xpath->query("//div/div");
$html = $dom->saveHTML($text->item(0));

And the result is

<div>text<b>more text</b>and <i>some more</i></div>

I thought of using preg_replace, but that does not seem like a good idea. How can I remove the parent element using XPath?

Answer 1:

You might need

$html = '';
foreach ($text->item(0)->childNodes as $child) {
  $html .= $dom->saveHTML($child);
}

That's pseudo code that iterates over the child nodes of the div element node; I hope I got the PHP syntax right.



Answer 2:

Instead of trying to remove the parent (which means producing the problematic output first and then working out what to strip from it), turn the problem around and don't include the parent in the first place. That means saving the HTML of all child nodes of that div.

First, the XPath expression for all child nodes of //div/div:

//div/div/node()

In XPath, node() matches nodes of any type: not only element nodes but also the text nodes you need here.
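To make the difference concrete, here is a small sketch that runs both `//div/div/*` and `//div/div/node()` against the markup from the question. The variable names (`$source`, `$elementsOnly`, `$allNodes`, `$names`) are only illustrative and not taken from the original code.

```php
<?php
// Markup from the question, collapsed onto one line.
$source = '<div><div>text<b>more text</b>and <i>some more</i></div></div>';

$dom = new DOMDocument();
$dom->loadHTML($source);
$xpath = new DOMXPath($dom);

// "*" selects element children only: <b> and <i>.
$elementsOnly = $xpath->query('//div/div/*');

// "node()" selects every child node, including the two text nodes.
$allNodes = $xpath->query('//div/div/node()');

// Collect the node names to show what each position holds.
$names = [];
foreach ($allNodes as $node) {
    $names[] = $node->nodeName; // text nodes report "#text"
}

echo $elementsOnly->length, ' element children, ',
     $allNodes->length, ' child nodes: ',
     implode(', ', $names), PHP_EOL;
```

With `*` the text "text" and "and " would be lost, which is why `node()` is the right step for this case.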

Now call $dom->saveHTML() on each of these nodes. One way is to map that method call over all the items:

$inner = $xpath->query("//div/div/node()");
$html  = implode('', array_map([$dom, 'saveHTML'], iterator_to_array($inner)));

This will make $html the following:

text<b>more text</b>and <i>some more</i>

Instead of mapping, you can also use slightly more verbose code that is probably easier to read:

$inner = $xpath->query("//div/div/node()");

$html = '';
foreach($inner as $node) {
    $html .= $dom->saveHTML($node);
}

Compared with the previous answer, this is similar but slightly simpler, because the XPath expression directly selects the nodes to save.
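If the same thing is needed in several places, the loop can be wrapped in a small helper. The following is only a sketch: `innerHtml()` is a hypothetical function name, not part of PHP's DOM extension, and it assumes the node it receives belongs to a DOMDocument (so that `ownerDocument` is set).

```php
<?php
// Hypothetical helper (not a built-in): serializes the children of $node
// and leaves out the node's own opening and closing tags.
function innerHtml(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // saveHTML() with a node argument dumps only that node's subtree.
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

// Markup from the question.
$source = "<div>\n<div>text<b>more text</b>and <i>some more</i></div>\n</div>";

$dom = new DOMDocument();
$dom->loadHTML($source);
$xpath = new DOMXPath($dom);

// item(0) returns null when nothing matches, so guard before serializing.
$div   = $xpath->query('//div/div')->item(0);
$inner = $div !== null ? innerHtml($div) : '';

echo $inner, PHP_EOL;
```

This keeps the XPath query focused on finding the container, while the helper decides how its content is serialized.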



Tags: php xpath parent