Simple HTML DOM - replace all occurrences of a cer

Posted 2019-08-05 19:56

Question:

I was wondering: how would I change all occurrences of a certain word in an HTML document, but only outside of tags?

Example: Let's say I want to replace all occurrences of myWordToReplace with <a href="#">myWordToReplace</a>.

So this HTML

<p data-something="myWordToReplace"> myWordToReplace andSomeOtherText</p>

should yield

<p data-something="myWordToReplace"> <a href="#">myWordToReplace</a> andSomeOtherText</p>

I tried to achieve this with regex, but it quickly turned into a mess. I thought perhaps a DOM parser would do the trick? Any help is appreciated.

EDIT: @Muhammet's answer does the trick if all of your text is wrapped in tags, but text that sits outside any tag is of course not replaced. I'm now trying to handle that case too.

Example: if I want to change myWord to someOtherWord:

Nam myWord pharetra <strong>auctor myWord</strong>

should yield

Nam someOtherWord pharetra <strong>auctor someOtherWord</strong>

but currently only the second occurrence is changed, the one inside the <strong> tags.

Answer 1:

You could do something like this with Simple HTML DOM, where find('text') returns the text blocks of the document:

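// $file_url is the URL or path of the page to load
$html = file_get_html($file_url);
// 'text' selects text blocks only, so attribute values are not touched
$content = $html->find('text');

foreach($content as $line) {
    if(strpos($line->innertext, 'myWordToReplace') !== false) {
        // Wrap every occurrence in this text block in a link
        $line->innertext = str_replace('myWordToReplace','<a href="#">myWordToReplace</a>', $line->innertext);
    }
}

// Print the modified document
echo $html;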


Answer 2:

Here is another DOM-based solution that wraps the matching parts of text nodes in <a> tags (using search as the sample keyword):

$html = "<html><body>\n<!-- This is a comment for search //-->\n<span class=\"search\">New search performed</span></body></html>";
$key = "search";
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// text() selects text nodes only, so comments and attribute values are left alone
foreach ($xpath->query('//text()') as $textNode) {
    $fragment = $dom->createDocumentFragment();
    $text = $textNode->nodeValue;

    // Case-insensitive search; each match is wrapped in its own <a> element
    while (($pos = stripos($text, $key)) !== false) {
        // Text before the match
        $fragment->appendChild(new DOMText(substr($text, 0, $pos)));
        // The matched text, keeping its original casing
        $word = substr($text, $pos, strlen($key));

        $lnk = $dom->createElement('a');
        $lnk->appendChild(new DOMText($word));
        $lnk->setAttribute('href', '#');
        $fragment->appendChild($lnk);

        // Continue searching after the match
        $text = substr($text, $pos + strlen($key));
    }
    // Append the remainder; strlen() is used because empty('0') is true
    if (strlen($text) > 0) {
        $fragment->appendChild(new DOMText($text));
    }
    $textNode->parentNode->replaceChild($fragment, $textNode);
}
echo $dom->saveHTML();

Here is an IDEONE demo
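
For the case raised in the question's EDIT (text that is not wrapped in any tag), the same approach can be applied to an HTML fragment. The following is only a minimal sketch: linkifyWord is a hypothetical helper name, and wrapping the input in a <div> together with the LIBXML_HTML_NOIMPLIED and LIBXML_HTML_NODEFDTD flags is an assumption made here so that untagged text gets a parent element and no <html>/<body> is added. It also assumes ASCII input.

```php
<?php
// Hypothetical helper, not from the answers above: wraps each case-insensitive
// occurrence of $key that appears in a text node in <a href="#">...</a>.
function linkifyWord(string $html, string $key): string
{
    $dom = new DOMDocument();
    // Assumption: a wrapper <div> gives untagged text a parent element; the two
    // flags ask libxml not to add <html>/<body> or a doctype around it.
    $dom->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    $xpath = new DOMXPath($dom);

    // Copy the node list first so replacing nodes cannot disturb the iteration.
    foreach (iterator_to_array($xpath->query('//text()')) as $textNode) {
        $text = $textNode->nodeValue;
        if ($key === '' || stripos($text, $key) === false) {
            continue; // nothing to wrap in this text node
        }
        $fragment = $dom->createDocumentFragment();
        while (($pos = stripos($text, $key)) !== false) {
            // Text before the match, then the match itself inside a link
            $fragment->appendChild($dom->createTextNode(substr($text, 0, $pos)));
            $link = $dom->createElement('a');
            $link->setAttribute('href', '#');
            $link->appendChild($dom->createTextNode(substr($text, $pos, strlen($key))));
            $fragment->appendChild($link);
            $text = substr($text, $pos + strlen($key));
        }
        if (strlen($text) > 0) {
            $fragment->appendChild($dom->createTextNode($text));
        }
        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    // Serialize the wrapper's children only, dropping the <div> itself.
    $out = '';
    foreach ($dom->documentElement->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

// The two examples from the question
echo linkifyWord('<p data-something="myWordToReplace"> myWordToReplace andSomeOtherText</p>', 'myWordToReplace'), "\n";
echo linkifyWord('Nam myWord pharetra <strong>auctor myWord</strong>', 'myWord'), "\n";
```

In the first call the data-something attribute should stay untouched because only text nodes are visited; in the second, both occurrences of myWord should be wrapped, including the one outside <strong>. To substitute plain text instead of a link, the <a> element could be swapped for a text node holding the replacement word.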



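Answer 3:

This works well for me. I could not get it to accept the "data-something" attribute; the dash between the two words seems to be the problem, most likely because PHP parses $p->data-something as a subtraction. The example therefore uses a plain data attribute, and I hope it is still useful for you.

I am using the Simple HTML DOM library: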

$my_html = '<p data="myWordToReplace"> myWordToReplace </p>';
$html = str_get_html($my_html);

foreach($html->find('p') as $p) {
    // Only touch paragraphs whose data attribute equals the word
    if (!empty($p->data) && ($p->data == 'myWordToReplace')) {
        // Note: this wraps the paragraph's whole content in the link
        $p->innertext = '<a href="#">'. $p->innertext .'</a>';
    }
}

echo $html;
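
On the dash problem mentioned above: a hyphenated name can be read as a property in PHP by quoting it inside braces. The sketch below illustrates only the syntax; FakeElement is a hypothetical stand-in class, and it assumes the library exposes attributes through magic property access in the same way, which has not been checked against Simple HTML DOM here.

```php
<?php
// Hypothetical stand-in for an element object that exposes its attributes
// through the __get magic method (an assumption for illustration only).
class FakeElement
{
    private array $attrs;

    public function __construct(array $attrs)
    {
        $this->attrs = $attrs;
    }

    public function __get(string $name)
    {
        // Return false for unknown attributes
        return $this->attrs[$name] ?? false;
    }
}

$p = new FakeElement(['data-something' => 'myWordToReplace']);

// $p->data-something would be parsed as ($p->data) - something,
// so the hyphenated name has to be quoted inside braces.
$value = $p->{'data-something'};
echo $value, "\n";
```

If the library behaves like this stand-in, the condition in the loop could compare $p->{'data-something'} instead of $p->data.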