Split HTML string into array based on parent-level elements

Posted 2019-09-01 16:43

Question:

I need to convert the HTML string returned by the_content(); in WordPress into an array with one entry per parent-level element. For example:

<h3>My subtitle</h3>
<p>Some content here</p>
<blockquote><p>A blockquote goes here</p></blockquote>

Would become:

array('<h3>My subtitle</h3>', '<p>Some content here</p>', '<blockquote><p>A blockquote goes here</p></blockquote>')

The reason we want to do this is to insert an ad into the content: after the first paragraph if the first paragraph or content block is longer than 670 characters, or after the second paragraph if it is shorter than that. The challenge arises when either of those paragraphs is wrapped in another element, or when another element is involved at all.

This is the code I currently have:

$content = apply_filters('the_content', get_the_content());
$content = explode("</p>", $content);
$firstParagraphLength = strlen($content[0]);

if($firstParagraphLength > 670) {
    $paragraphAdAfter = 1;
} else {
    $paragraphAdAfter = 2;
}

// If this is after the target paragraph, insert ad code first
for ($i = 0; $i < count($content); $i++) {
    if ($i == $paragraphAdAfter) { ?>
        <!-- AD CODE -->
        My ad code goes here, great!
    <?php
    }
    echo $content[$i] . "</p>";
} ?>

This actually works, but if a blockquote is involved in either the first paragraph or the second, the ad is inserted into the blockquote element. The data is pretty dynamic, so I need to figure out a way to split based on the parent-level elements, whether they are blockquotes, paragraphs, headlines, etc.

Answer 1:

Try the code snippet below, which uses DOMDocument:

$string = '
<h3>My subtitle</h3>
<p>Some content here</p>
<blockquote><p>A blockquote goes here</p></blockquote>
';

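$dom = new DOMDocument;
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($string);
libxml_clear_errors();

// loadHTML() wraps the fragment in <html><body>, so the parent-level
// elements are the element children of <body>. getElementsByTagName('*')
// would also return <html>, <body> and the nested <p>.
$array = [];
$body = $dom->getElementsByTagName('body')->item(0);
foreach($body->childNodes as $node)
{
    if ($node->nodeType === XML_ELEMENT_NODE) { // skip whitespace text nodes
        $array[] = $dom->saveHTML($node);
    }
}

print_r($array);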

Demo URL:
http://sandbox.onlinephpfunctions.com/code/e382a845f121f8c4a56595f075a9b1d9fee2d2de
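
To connect this back to the ad placement described in the question, here is a minimal sketch that splits the content into its parent-level elements and inserts the ad after the first or second one. The function names split_top_level_elements() and insert_ad() and the $threshold parameter are hypothetical, chosen for illustration only. The sketch measures the first block's markup length with strlen(), mirroring the question's code, and it drops any top-level text that is not wrapped in an element.

```php
<?php
// Hypothetical helper: return the outer HTML of each parent-level element.
function split_top_level_elements(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
    $dom->loadHTML('<body>' . $html . '</body>');
    libxml_clear_errors();

    $blocks = [];
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $blocks;
    }
    foreach ($body->childNodes as $node) {
        // Keep elements only; whitespace text nodes and comments are skipped.
        if ($node->nodeType === XML_ELEMENT_NODE) {
            $blocks[] = trim($dom->saveHTML($node));
        }
    }
    return $blocks;
}

// Hypothetical helper: apply the 670-character rule from the question.
function insert_ad(string $html, string $adCode, int $threshold = 670): string
{
    $blocks = split_top_level_elements($html);

    // Length of the first block's markup (assumption: same measure as the question's code).
    $firstLength = isset($blocks[0]) ? strlen($blocks[0]) : 0;
    $adAfter = $firstLength > $threshold ? 1 : 2;

    // Insert the ad as its own entry; if there are fewer blocks, it goes at the end.
    array_splice($blocks, min($adAfter, count($blocks)), 0, [$adCode]);

    return implode("\n", $blocks);
}
```

In a template this could be called as echo insert_ad(apply_filters('the_content', get_the_content()), $adCode); where $adCode holds the ad markup. Because the ad is inserted between whole parent-level elements, it can no longer land inside a blockquote. Note that loadHTML() may not treat the input as UTF-8 by default, so content with non-ASCII characters may need an explicit encoding declaration before parsing.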



Tags: php wordpress