PHP: Extracting string between two tags by child's content [duplicate]

Posted 2019-10-21 04:19

This question already has an answer here:

  • How do you parse and process HTML/XML in PHP? 30 answers

I have the following HTML markup:

<ul>
    <li>
        <strong>Online:</strong>
        2/14/2010 3:40 AM
    </li>
    <li>
        <strong>Hearing Impaired:</strong>
        No
    </li>
    <li>
        <strong>Downloads:</strong>
        3,840
    </li>
</ul>

I want to capture 3,840 from the last li, locating it by its "Downloads:" label.

What would you suggest?

I tried:

preg_match('/<li><strong>Downloads:<\/strong>(.*?)<\/li>/s', $s, $a);

Answer 1:

I suggest using an HTML parser here, DOMDocument in particular, together with XPath.

Example:

$markup = '<ul>
    <li>
        <strong>Online:</strong>
        2/14/2010 3:40 AM
    </li>
    <li>
        <strong>Hearing Impaired:</strong>
        No
    </li>
    <li>
        <strong>Downloads:</strong>
        3,840
    </li>
</ul>';

$dom = new DOMDocument();
$dom->loadHTML($markup);
$xpath = new DOMXPath($dom);
// Get the text node that follows the <strong> element whose text is "Downloads:"
$download = trim($xpath->evaluate("string(//strong[text()='Downloads:']/following-sibling::text())"));
echo $download; // 3,840
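
As a rough sketch of how the lookup above could be reused for other labels, the query can be wrapped in a small function. The name `extractValueByLabel` is hypothetical and not part of the original answer. This variant compares the label in PHP rather than inside the XPath string, so the label does not need XPath quoting, and it silences libxml warnings, assuming the input may be an imperfect HTML fragment.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text that
// follows a <strong> element whose text equals $label, or null if not found.
function extractValueByLabel(string $html, string $label): ?string
{
    $dom = new DOMDocument();
    // Collect libxml warnings instead of printing them for imperfect HTML.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//li/strong') as $strong) {
        if (trim($strong->textContent) === $label) {
            // First text node after <strong>, with surrounding whitespace collapsed.
            $value = $xpath->evaluate('normalize-space(following-sibling::text()[1])', $strong);
            return $value !== '' ? $value : null;
        }
    }
    return null;
}

$markup = "<ul>\n    <li>\n        <strong>Downloads:</strong>\n        3,840\n    </li>\n</ul>";
echo extractValueByLabel($markup, 'Downloads:'); // 3,840
```

Returning null for a missing label lets the caller tell "label not present" apart from an empty value.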


Answer 2:

Use an HTML parser to parse HTML files. If you insist on a regular expression, you could try the following:

<li>[^<>]*<strong>Downloads:<\/strong>\s*\K.*?(?=\s*<\/li>)

Code:

$string = <<<EOT
<ul>
    <li>
        <strong>Online:</strong>
        2/14/2010 3:40 AM
    </li>
    <li>
        <strong>Hearing Impaired:</strong>
        No
    </li>
    <li>
        <strong>Downloads:</strong>
        3,840
    </li>
</ul>
EOT;
$regex = '~<li>[^<>]*<strong>Downloads:</strong>\s*\K.*?(?=\s*</li>)~s';
if (preg_match($regex, $string, $m)) {
    // \K discards everything matched so far, so $m[0] holds only the value.
    $yourmatch = $m[0];
    echo $yourmatch; // 3,840
}
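
For comparison, here is a minimal sketch of why the pattern from the question finds nothing on this markup, next to a capture-group variant that tolerates whitespace. The variable names are illustrative only. The assumption is that the markup has newlines and indentation between `<li>` and `<strong>`, as in the sample above.

```php
<?php
$html = "<ul>\n    <li>\n        <strong>Downloads:</strong>\n        3,840\n    </li>\n</ul>";

// Pattern from the question: it requires <strong> to follow <li> immediately.
$strict = '/<li><strong>Downloads:<\/strong>(.*?)<\/li>/s';

// Variant that allows whitespace around the tags and captures the value in group 1.
$tolerant = '~<li>\s*<strong>Downloads:</strong>\s*(.*?)\s*</li>~s';

$strictMatched = preg_match($strict, $html);          // 0: the newline after <li> blocks the match
$tolerantMatched = preg_match($tolerant, $html, $m);  // 1

if ($tolerantMatched) {
    echo $m[1]; // 3,840
}
```

A capture group is used here instead of `\K` and a lookahead. Both approaches should give the same value on this input, and the choice between them is mostly a matter of readability.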

