I have this HTML code, all on a single line:
<h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3><h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3>
Here is the line-friendly version (that I can't use):
<h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3>
<h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3>
And I'm trying to extract just the URLs with this regex:
/<h3 class="r"><a href="(.*)">(.*)<\/a>/
And it returns
www.google.com">fkdsafjldsajl</a></h3><h3 class='r'><a href="www.google.com"
What can I do to make it stop at the first " it finds?
Sigh. Regex and HTML are such awkward bedfellows:
This will find them, whether they are deeply nested or all on one line.
The problem is that * is greedy. Put a question mark after it to make it ungreedy.
Working regex (tested on Rubular)
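A minimal sketch of that fix in Ruby, assuming the markup is held in a string called html (a hypothetical variable name). It compares the greedy pattern with the ungreedy one on the sample line from the question. The patterns here accept either quote style around the class value, because the sample HTML uses single quotes there while the question's pattern uses double quotes.

```ruby
# Sample input from the question: two <h3> results on a single line.
# (html is a hypothetical variable name used for illustration.)
html = %q(<h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3><h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3>)

# Greedy: (.*) grabs as much as it can, so it runs past the first
# closing quote and swallows both links in a single match.
greedy = /<h3 class=['"]r['"]><a href="(.*)">(.*)<\/a>/

# Ungreedy: (.*?) grabs as little as it can, so it stops at the
# first closing quote after href=".
lazy = /<h3 class=['"]r['"]><a href="(.*?)">(.*?)<\/a>/

# scan returns one [url, text] pair per match; keep only the URL.
greedy_urls = html.scan(greedy).map(&:first)
lazy_urls   = html.scan(lazy).map(&:first)

# Alternative that avoids laziness altogether: a negated character
# class that cannot cross a double quote.
negated_urls = html.scan(/<a href="([^"]*)">/).flatten

p lazy_urls  # prints ["www.google.com", "www.google.com"]
```

The negated class [^"]* is often the safer choice, since it states directly that the match may not contain a quote instead of relying on how the engine backtracks.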