Stripping images from get_the_content() also removes <p>

Posted 2020-08-01 08:23

Question:

I'm trying to strip all images and the surrounding <a> tag from get_the_content() with this piece of code:

<?php
$content = get_the_content();
$postOutput = preg_replace(array('{<a[^>]*><img[^>]+.}','{></a>}'),'', $content);
echo $postOutput;
?>

That works fine, except there are no <p> tags around the paragraphs.
1. Is this normal when using get_the_content()?
2. If so, how can I add them to my result?
3. Or is my regex wrong?

Answer 1:

Okay, I will answer my own questions:

  1. Yes, it is normal that get_the_content() returns the content without <p> tags. Thanks, @s_ha_dum.
  2. To add the <p> tags, I need to apply the content filter as mentioned here (scroll down to "Alternative Usage").
  3. The regex is not wrong, but parsing HTML with a regex is pretty dirty. Thanks, @MarcB.
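
Point 2 can be sketched roughly as follows. apply_filters() and the 'the_content' hook are WordPress's own; in a default setup wpautop() is one of the callbacks attached to that hook, which is where the <p> tags come from. The helper name render_post_content() is hypothetical, and the function_exists() guard is only an assumption added so the snippet also runs outside WordPress.

```php
<?php
// Hypothetical helper (not a WordPress function): run raw post content
// through the same steps the_content() applies before echoing.
function render_post_content(string $raw): string
{
    // Inside WordPress, the 'the_content' filters add the <p> tags.
    // Outside WordPress (assumption for this sketch) the text passes through unchanged.
    $html = function_exists('apply_filters')
        ? apply_filters('the_content', $raw)
        : $raw;

    // the_content() also escapes "]]>" so the output stays safe inside CDATA sections.
    return str_replace(']]>', ']]&gt;', $html);
}
```

In a template this would be called as echo render_post_content(get_the_content()); inside the loop.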

See the new question that led to the following code here.

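// Parse the post content into a DOM tree instead of using a regex.
$dom = new DOMDocument;
$dom->loadHTML(get_the_content());
$xpath = new DOMXPath($dom);
// Select every <img> and every <a> that has an <img> as a direct child.
$nodes = $xpath->query('//img|//a[img]');
foreach ($nodes as $node) {
    $node->parentNode->removeChild($node);
}
// Note: saveHTML() without arguments serializes the whole document,
// including the DOCTYPE and <html>/<body> wrappers that loadHTML() adds.
$no_image_content = $dom->saveHTML();
// Run the content filters (this is what adds the <p> tags),
// then escape "]]>" the same way the_content() does.
$no_image_content = apply_filters('the_content', $no_image_content);
$no_image_content = str_replace(']]>', ']]&gt;', $no_image_content);
echo $no_image_content;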
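
One possible refinement of the code above, offered as a sketch and not as the original author's method: wrap the fragment in a container element and serialize only that container's children, so the DOCTYPE and <html>/<body> wrappers never reach the output. The function name strip_images() and the id "strip-images-root" are hypothetical. The <?xml encoding="UTF-8"> prefix is a commonly used workaround to make loadHTML() read the input as UTF-8, and it is assumed here that the post content is UTF-8.

```php
<?php
// Hypothetical helper: remove every <img> and every <a> that directly wraps one,
// returning an HTML fragment without document wrappers.
function strip_images(string $html): string
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally so imperfect markup does not emit notices.
    $previous = libxml_use_internal_errors(true);
    // The wrapper <div> gives the fragment a single, easy-to-find root element.
    $dom->loadHTML('<?xml encoding="UTF-8"><div id="strip-images-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Same XPath expression as in the answer; copy the result to an array before removing.
    foreach (iterator_to_array($xpath->query('//img|//a[img]')) as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    $root = $xpath->query('//div[@id="strip-images-root"]')->item(0);
    if ($root === null) {
        return $html; // Parsing failed; fall back to the untouched input.
    }

    // Serialize only the wrapper's children, not the whole document.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}
```

Inside WordPress the result could then be passed on as before: echo apply_filters('the_content', strip_images(get_the_content()));. Links that do not wrap an image are left in place.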