I want to get the HTML inside the parent element. For example, I have this structure:
<div>
<div>text<b>more text</b>and <i>some more</i></div>
</div>
and I want to get text<b>more text</b>and <i>some more</i>
as a result.
Here's my code:
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$text = $xpath->query("//div/div");
$html = $dom->saveHTML($text->item(0));
And the result is
<div>text<b>more text</b>and <i>some more</i></div>
I thought of using preg_replace but it's not a good idea. How can I remove the parent element using XPath?
You might need
$html = '';
foreach ($text->item(0)->childNodes as $child) {
    $html .= $dom->saveHTML($child);
}
This iterates over the child nodes of the div element node and appends the markup of each one to the result. Passing a node to saveHTML() requires PHP 5.3.6 or later.
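To make the loop reusable, it could be wrapped in a small function. The following is only a sketch: the name innerHtml() is hypothetical (it is not part of PHP's DOM extension), the type declarations assume PHP 7 or later, and it assumes the node belongs to a document (for a DOMDocument itself, ownerDocument is null).

```php
<?php
// Hypothetical helper (not part of the DOM extension): returns the markup
// of all children of $node, i.e. the node's "inner HTML".
function innerHtml(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // Each child is serialized by the document that owns the node.
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

$dom = new DOMDocument();
$dom->loadHTML('<div><div>text<b>more text</b>and <i>some more</i></div></div>');
$xpath = new DOMXPath($dom);

// Select the inner div, then serialize only its children.
$div = $xpath->query('//div/div')->item(0);
$result = innerHtml($div);
echo $result, "\n";
```

Text nodes are children too, so they end up in the result alongside the b and i elements.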
Instead of trying to remove the parent (which means producing the unwanted output first and then stripping something from it), turn the problem around and do not add the parent in the first place. That is, save the HTML of all child nodes of that div.
First, the XPath expression for all child nodes of //div/div:
//div/div/node()
In XPath, node() matches a node of any type, so not only element nodes but also the text nodes that you need here.
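The difference between node() and the element-only wildcard * can be checked with a short script. This is a minimal sketch using the markup from the question; the variable names are illustrative.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<div><div>text<b>more text</b>and <i>some more</i></div></div>');
$xpath = new DOMXPath($dom);

// "*" selects element children only: <b> and <i>.
$elementsOnly = $xpath->query('//div/div/*');

// "node()" selects every child node: two text nodes plus <b> and <i>.
$allNodes = $xpath->query('//div/div/node()');

echo $elementsOnly->length, "\n"; // 2
echo $allNodes->length, "\n";     // 4

// List the class and name of each selected node, in document order.
foreach ($allNodes as $node) {
    echo get_class($node), ': ', $node->nodeName, "\n";
}
```

With //div/div/* the text "text" and "and " would be missing from the saved HTML, which is why node() is used here.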
So you now want to call $dom->saveHTML() on all of these nodes. This can be done by mapping that method call onto every item of the result:
$inner = $xpath->query("//div/div/node()");
$html = implode('', array_map([$dom, 'saveHTML'], iterator_to_array($inner)));
This sets $html to the following:
text<b>more text</b>and <i>some more</i>
Instead of mapping, you can also use this slightly more verbose code, which is probably easier to read:
$inner = $xpath->query("//div/div/node()");
$html = '';
foreach ($inner as $node) {
    $html .= $dom->saveHTML($node);
}
Compared with the previous answer, this is similar but slightly simpler, because the XPath expression directly selects the nodes to save.
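One practical difference is worth noting: when nothing matches, item(0) returns null, so code that reads ->childNodes from it does not work as intended, whereas looping over an empty XPath result simply produces an empty string. A sketch of the XPath-based variant as a function, assuming PHP 7 or later (the name innerHtmlByXPath() is hypothetical):

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
// Concatenates the markup of every node selected by $expression.
function innerHtmlByXPath(DOMDocument $dom, string $expression): string
{
    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query($expression);
    if ($nodes === false) {
        // query() returns false when the expression is malformed.
        throw new InvalidArgumentException("Invalid XPath expression: $expression");
    }

    $html = '';
    foreach ($nodes as $node) {
        $html .= $dom->saveHTML($node);
    }
    return $html;
}

$dom = new DOMDocument();
$dom->loadHTML('<div><div>text<b>more text</b>and <i>some more</i></div></div>');

$found = innerHtmlByXPath($dom, '//div/div/node()');
$missing = innerHtmlByXPath($dom, '//section/div/node()'); // no <section> in the markup

echo $found, "\n";
var_dump($missing);
```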