This question already has an answer here:
- How do you parse and process HTML/XML in PHP? (30 answers)
I have the following HTML markup:
<ul>
<li>
<strong>Online:</strong>
2/14/2010 3:40 AM
</li>
<li>
<strong>Hearing Impaired:</strong>
No
</li>
<li>
<strong>Downloads:</strong>
3,840
</li>
</ul>
I want to capture 3,840 from the last li, identified by the "Downloads:" label.
What do you suggest?
I tried:
preg_match('/<li><strong>Downloads:<\/strong>(.*?)<\/li>/s', $s, $a);
I would suggest using an HTML parser here: DOMDocument, and in particular XPath.
Example:
$markup = '<ul>
<li>
<strong>Online:</strong>
2/14/2010 3:40 AM
</li>
<li>
<strong>Hearing Impaired:</strong>
No
</li>
<li>
<strong>Downloads:</strong>
3,840
</li>
</ul>';
$dom = new DOMDocument();
$dom->loadHTML($markup);
$xpath = new DOMXPath($dom);
// Take the text node that follows the <strong> element whose text is exactly 'Downloads:' and return it as a string
$download = trim($xpath->evaluate("string(//strong[text()='Downloads:']/following-sibling::text())"));
echo $download; // 3,840
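The same lookup can be wrapped in a small reusable helper. This is only a sketch: the function name `extractField` is hypothetical (it is not part of DOMDocument), and it assumes the label contains no double quotes. It uses `normalize-space()` so stray whitespace around the label does not break the match, and `libxml_use_internal_errors()` to keep warnings from imperfect markup out of the output.

```php
<?php
// Hypothetical helper, not from the original answer: returns the text that
// follows <strong>$label</strong> inside an <li>, or null when it is absent.
function extractField(string $html, string $label): ?string
{
    // Collect libxml warnings internally instead of emitting them
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return null;
    }

    $xpath = new DOMXPath($dom);
    // Assumption: $label contains no double quotes, so plain interpolation is safe
    $query = sprintf(
        '//li/strong[normalize-space(.)="%s"]/following-sibling::text()[1]',
        $label
    );
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    $value = trim($nodes->item(0)->nodeValue);
    return $value === '' ? null : $value;
}

$markup = '<ul>
<li>
<strong>Online:</strong>
2/14/2010 3:40 AM
</li>
<li>
<strong>Downloads:</strong>
3,840
</li>
</ul>';

$downloads = extractField($markup, 'Downloads:');
echo $downloads, PHP_EOL;                             // 3,840
echo (int) str_replace(',', '', $downloads), PHP_EOL; // 3840
```

Stripping the thousands separator before the `(int)` cast matters: casting "3,840" directly would stop at the comma and yield 3.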
Use an HTML parser to parse HTML files. If you insist on a regular expression, you can try the following:
<li>[^<>]*<strong>Downloads:<\/strong>\s*\K.*?(?=\s*<\/li>)
Code:
$string = <<<EOT
<ul>
<li>
<strong>Online:</strong>
2/14/2010 3:40 AM
</li>
<li>
<strong>Hearing Impaired:</strong>
No
</li>
<li>
<strong>Downloads:</strong>
3,840
</li>
</ul>
EOT;
$regex = '~<li>[^<>]*<strong>Downloads:<\/strong>\s*\K.*?(?=\s*<\/li>)~s';
if (preg_match($regex, $string, $m)) {
    // \K discards everything matched so far, so $m[0] holds only the value
    $yourmatch = $m[0];
    echo $yourmatch; // 3,840
}
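For comparison, here is a sketch of why the pattern in the question finds nothing, alongside a capture-group variant that some may find easier to read than `\K` plus a lookahead. The variable names are illustrative only, and the looser pattern assumes the value itself contains no tags.

```php
<?php
$html = "<ul>\n<li>\n<strong>Downloads:</strong>\n3,840\n</li>\n</ul>";

// Pattern from the question: it requires <strong> immediately after <li>,
// but the markup has a newline between the two tags, so nothing matches.
$strict = '/<li><strong>Downloads:<\/strong>(.*?)<\/li>/s';
var_dump(preg_match($strict, $html)); // int(0)

// Allow optional whitespace around the tags and capture only the value.
$loose = '~<li>\s*<strong>Downloads:</strong>\s*([^<]*?)\s*</li>~';
if (preg_match($loose, $html, $m)) {
    $downloads = (int) str_replace(',', '', $m[1]); // drop the thousands separator
    echo $m[1], PHP_EOL;      // 3,840
    echo $downloads, PHP_EOL; // 3840
}
```

Using `~` as the delimiter avoids having to escape the slashes in the closing tags.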