Question:

I am new to Regex. I have a string like:

Hello <b>ABCD</b> World
or 
<b>ABCD</b>Hello World

I basically want to retain the text inside bold tags but remove all other characters in the string.

I have found code that removes the bold part of the string:

$string = 'This is <b>an</b> example <b>text</b>';
echo preg_replace('/(<b>.+?)+(<\/b>)/i', '', $string);

So how do I make it work the opposite way?

Regards, Ahmar

Answer 1:

Use a DOM parser instead of a regex if you want to extract data from an HTML or XML document. A regex works in simple cases, but it tends to break when the markup gets more complicated (attributes, nested tags, line breaks) or the input changes in an unexpected way. A DOM parser is more robust and more convenient for this purpose.

Example code:

$doc = new DOMDocument();
$doc->loadHTML('Hello <b>ABCD</b> World');

foreach($doc->getElementsByTagName('b') as $element) {
    echo $element->nodeValue;
}
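
If you need the bold text as a single string rather than echoed piece by piece, the same DOM approach can be wrapped in a small function. This is only a sketch: extractBoldText is a hypothetical helper name, and joining the pieces with no separator is an assumption about what you want. It also tells libxml to collect parse warnings internally, since HTML fragments often trigger them.

```php
<?php
// Hypothetical helper: returns the text of every <b> element, concatenated.
function extractBoldText(string $html): string
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $parts = [];
    foreach ($doc->getElementsByTagName('b') as $element) {
        $parts[] = $element->textContent;
    }

    // Assumption: the pieces are joined without a separator.
    return implode('', $parts);
}

echo extractBoldText('Hello <b>ABCD</b> World'), "\n"; // ABCD
echo extractBoldText('<b>ABCD</b>Hello World'), "\n";  // ABCD
```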

Answer 2:

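Use preg_match_all:

$text = 'This is <b>an</b> example <b>text</b>'; // sample input
// The single quotes inside the string are the regex delimiters;
// s lets . match newlines, i makes the match case-insensitive.
preg_match_all("'<b>(.*?)</b>'si", $text, $match);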

foreach($match[1] as $val)
{
    echo $val."<br>";
}
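
To get back to what the question asks (keep the bold text, drop everything else), the captured groups can be joined into one string. A minimal sketch, where keepBoldText is a hypothetical name and the single space used as separator is an assumption:

```php
<?php
// Hypothetical helper: keeps only the text found inside <b>...</b> pairs.
function keepBoldText(string $text): string
{
    // s: let . match newlines; i: match <B> as well as <b>.
    preg_match_all('~<b>(.*?)</b>~si', $text, $match);

    // $match[1] holds the contents of the first capture group for each match.
    // Assumption: the pieces are joined with a single space.
    return implode(' ', $match[1]);
}

echo keepBoldText('This is <b>an</b> example <b>text</b>'), "\n"; // an text
echo keepBoldText('Hello <b>ABCD</b> World'), "\n";               // ABCD
```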

Answer 3:

Try this:

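function getTextBetweenTags($string, $tagname) {
    // Escape the tag name, skip any attributes, and capture lazily
    // so that each tag pair is matched separately.
    $tag = preg_quote($tagname, '/');
    $pattern = "/<{$tag}\b[^>]*>(.*?)<\/{$tag}>/is";
    preg_match_all($pattern, $string, $matches);
    return $matches[1];
}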

$str = 'This is <b>an example text</b>';
$txt = getTextBetweenTags($str, "b");
print_r($txt);
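
Whether the quantifier in a pattern like this is greedy or lazy matters as soon as the string contains more than one tag pair. A short comparison, using the example string from the question, as a sketch of the difference:

```php
<?php
$str = 'This is <b>an</b> example <b>text</b>';

// Greedy: (.*) runs from the first <b> to the last </b>.
preg_match_all('/<b>(.*)<\/b>/', $str, $greedy);

// Lazy: (.*?) stops at the nearest </b>, so each pair matches on its own.
preg_match_all('/<b>(.*?)<\/b>/', $str, $lazy);

print_r($greedy[1]); // one element: "an</b> example <b>text"
print_r($lazy[1]);   // two elements: "an" and "text"
```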