I'd like to "grab" a few hundred URLs from a few hundred HTML pages.
Pattern:
<h2><a href="http://www.the.url.might.be.long/urls.asp?urlid=1" target="_blank">The Website</a></h2>
Here is how to do it properly with the native DOM extension.
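A minimal sketch of the DOM approach, assuming PHP with the bundled DOM extension (`DOMDocument` and `DOMXPath`). The sample markup, the helper name `extractH2Links`, and the XPath expression `//h2/a/@href` are illustrative assumptions based on the pattern in the question, not necessarily the code the answer originally showed.

```php
<?php
// Hypothetical sample page following the pattern from the question.
$html = <<<'MARKUP'
<html><body>
<h2><a href="http://www.the.url.might.be.long/urls.asp?urlid=1" target="_blank">The Website</a></h2>
<h2><a href="/relative/page.html">A relative link</a></h2>
<p><a href="http://example.com/not-in-h2">Not inside an h2</a></p>
</body></html>
MARKUP;

// Hypothetical helper: returns the href of every <a> that is a direct child of an <h2>.
function extractH2Links(string $html): array
{
    $dom = new DOMDocument();

    // Real-world HTML is rarely valid; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $urls = [];
    // The query selects the href attribute nodes themselves.
    foreach ($xpath->query('//h2/a/@href') as $attr) {
        $urls[] = $attr->nodeValue;
    }
    return $urls;
}

$urls = extractH2Links($html);
print_r($urls);
```

For a few hundred pages, the same helper could be called in a loop over the page contents (for example, loaded with `file_get_contents`), merging the returned arrays.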
Note that the above will also find relative links. If you don't want those, adjust the XPath to match only absolute URLs.
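One possible way to exclude relative links is a `starts-with` predicate on the `href` attribute. This is a sketch under the assumption that "absolute" means "begins with `http`"; the expression below is an illustration, not necessarily the one the answer originally gave, and the sample markup is hypothetical.

```php
<?php
// Hypothetical sample page with one absolute and one relative link.
$html = <<<'MARKUP'
<html><body>
<h2><a href="http://www.the.url.might.be.long/urls.asp?urlid=1" target="_blank">The Website</a></h2>
<h2><a href="/relative/page.html">A relative link</a></h2>
</body></html>
MARKUP;

$dom = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$absoluteUrls = [];
// Only <a> elements whose href begins with "http" (this also covers "https").
foreach ($xpath->query('//h2/a[starts-with(@href, "http")]/@href') as $attr) {
    $absoluteUrls[] = $attr->nodeValue;
}
print_r($absoluteUrls);
```

The predicate is a plain string-prefix check, so protocol-relative links such as `//example.com/page` would be skipped as well.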
Note that using regex to match HTML is the road to madness. Regex matches string patterns and knows nothing about HTML elements and attributes. DOM does, which is why you should prefer it over regex in every situation that goes beyond matching a trivially simple string pattern in markup.
Better still, use a dedicated HTML parser such as PHP Simple HTML DOM.