I'm experimenting with Yahoo! Pipes and have hit a wall with a regular expression. I'm familiar with regular expressions from Perl, but the rules seem to be different in Yahoo! Pipes.
I'm fetching a page and trying to turn it into a feed. My regex for stripping the link out of the HTML works fine, but the title, which should be the text inside the <i> tags, just comes out as the original text.
Sample text that matches in Perl and on this online regexp tester:
<a rel="nofollow" target="_blank" HREF="http://changed.to/protect/the-guilty.html"><i>"Fee Fi Fo Fun" (English Man)</i></a> (See also this other site <a rel="nofollow" target="_blank" href="http://stackoverflow.com">Nada</a>) Some other text here
One important thing to watch out for with Yahoo! Pipes: do not trust the debug screen. It has a quirk of hiding some tags from view, which can cause no end of confusion when you are writing regexes. To expose any hidden HTML, replace '<' with something like '#'.
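The same trick can be tried outside Pipes to see what it does to a string. This is only a rough Python sketch, not Pipes itself: the function name expose_tags and the snippet are made up for illustration, and in Pipes the equivalent would presumably be a replace rule that swaps '<' for '#'.

```python
def expose_tags(html, marker="#"):
    """Replace every '<' so a viewer that hides tags shows them as plain text."""
    # Hypothetical helper: mimics the "replace '<' with '#'" debugging trick.
    return html.replace("<", marker)


# Illustrative input, taken from the second link in the sample text.
snippet = '<a href="http://stackoverflow.com">Nada</a>'
exposed = expose_tags(snippet)
print(exposed)
```

With the markup made visible like this, it is easier to check what the regex is really being matched against.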
RegEx for the title:
RegEx for the link:
The case-insensitive checkbox seems to be broken. Luckily, you can use the inline modifier (?i) in the pattern instead, which works nicely.
Here is a nice web2.0-ish tool for testing regular expressions: RegExr. But for some reason it's still beta. ;-)
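To illustrate what the (?i) modifier changes, here is a small sketch using Python's re module on the sample text from the question. It is not the Yahoo! Pipes engine, and the two patterns are assumptions written for this example, not the original poster's regexes.

```python
import re

# Sample text from the question; note the upper-case HREF on the first link.
SAMPLE = (
    '<a rel="nofollow" target="_blank" '
    'HREF="http://changed.to/protect/the-guilty.html">'
    '<i>"Fee Fi Fo Fun" (English Man)</i></a> '
    '(See also this other site <a rel="nofollow" target="_blank" '
    'href="http://stackoverflow.com">Nada</a>) Some other text here'
)

# Hypothetical patterns: (?i) at the start makes the whole pattern
# case-insensitive, so no separate checkbox or flag is needed.
LINK_PATTERN = r'(?i)href="([^"]+)"'
TITLE_PATTERN = r'(?i)<i>(.*?)</i>'


def first_group(pattern, text):
    """Return the first capture group of the first match, or None."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


link = first_group(LINK_PATTERN, SAMPLE)
title = first_group(TITLE_PATTERN, SAMPLE)

# Without (?i), the upper-case HREF is skipped and the second link matches.
case_sensitive_link = first_group(r'href="([^"]+)"', SAMPLE)

print(link)
print(title)
print(case_sensitive_link)
```

The case-sensitive pattern skips the first link entirely, which is the kind of silent mismatch a broken case-insensitive option would produce.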