PHP regular expression to get specific URLs

Posted 2019-09-04 17:04

I would like to get the URLs from a webpage that start with "../category/" from the tags below:

<a href="../category/product/pc.html" target="_blank">PC</a><br>
<a href="../category/product/carpet.html" target="_blank">Carpet</a><br>

Any suggestion would be very much appreciated.

Thanks!

Tags: php regex url
2 answers
Fickle 薄情
#2 · 2019-09-04 17:23

No regular expression is required. A simple XPath query with DOM will suffice:

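// Parse the HTML string into a DOM tree
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select every <a> whose href attribute starts with "../category/"
$nodes = $xpath->query('//a[starts-with(@href, "../category/")]');
foreach ($nodes as $node) {
    // Print the link text followed by the href value
    echo $node->nodeValue.' = '.$node->getAttribute('href').PHP_EOL;
}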

This will print:

PC = ../category/product/pc.html
Carpet = ../category/product/carpet.html
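Real pages are rarely valid HTML, and loadHTML() tends to emit warnings for them. The following is a minimal sketch of the same XPath approach that collects the matching href values into an array and silences parser warnings with libxml_use_internal_errors(). The sample markup, including the extra "../other/page.html" link, and the variable name $urls are assumptions made for illustration; they are not part of the question.

```php
<?php
// Sample markup based on the question; the third link is a hypothetical
// non-matching entry added to show that it is filtered out.
$html = '<a href="../category/product/pc.html" target="_blank">PC</a><br>'
      . '<a href="../category/product/carpet.html" target="_blank">Carpet</a><br>'
      . '<a href="../other/page.html">Other</a>';

// Collect libxml parse warnings internally instead of printing them.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);

// Discard the collected warnings and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);

// Gather the href of every <a> whose href starts with "../category/".
$urls = [];
foreach ($xpath->query('//a[starts-with(@href, "../category/")]') as $node) {
    $urls[] = $node->getAttribute('href');
}

print_r($urls);
```

Keeping the results in an array rather than echoing them makes it easier to resolve the relative paths against the page URL afterwards.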
神经病院院长
#3 · 2019-09-04 17:36

This regex searches for your ../category/ string:

preg_match_all('#......="(\.\./category/.*?)"#', $test, $matches);

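Every character in the pattern that is not a metacharacter is matched literally. The six leading dots each match any single character (here they cover "a href"); you can replace the ...... with literal text to make the pattern more specific. Because # is used as the delimiter, the slashes need no escaping; only the literal dots do, written as \. so they are not treated as wildcards. The .*? matches a variable-length string non-greedily, so it stops at the first closing quote. The () captures the matched path name, so it appears in $matches. The manual explains the rest of the syntax: http://www.php.net/manual/en/book.pcre.php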
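As an illustration of making the pattern more specific, here is a sketch that replaces the wildcard dots with the literal text href and uses [^"]* in place of .*? to stop at the closing quote. The sample input (with one hypothetical non-matching link) and the variable names $count and $paths are assumptions for the example, not part of the original answer.

```php
<?php
// Sample input based on the question; the third link is a hypothetical
// non-matching entry.
$test = '<a href="../category/product/pc.html" target="_blank">PC</a><br>'
      . '<a href="../category/product/carpet.html" target="_blank">Carpet</a><br>'
      . '<a href="../other/page.html">Other</a>';

// href="           literal attribute name instead of six wildcard dots
// \.\./category/   escaped dots, so they match literal "." characters
// [^"]*            any characters up to (not including) the closing quote
$count = preg_match_all('#href="(\.\./category/[^"]*)"#', $test, $matches);

// $matches[0] holds the full matches; $matches[1] holds the captured paths.
$paths = $matches[1];

print_r($paths);
```

preg_match_all() returns the number of full matches, so $count can be used to check whether anything was found before reading $matches[1].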
