
Regular expression on Yahoo! pipes

Posted 2020-03-26 06:21


I'm fooling around with Yahoo! Pipes and I'm hitting a wall with some regular expressions. I'm familiar with regular expressions from Perl, but the rules seem to be different in Yahoo! Pipes.

What I'm doing is fetching a page and trying to turn it into a feed. My regex for stripping the link out of the HTML works fine, but the title, which should be the text inside the <i> tags, just outputs the original text.

Sample text that matches in Perl and on this online regexp tester:

<a rel="nofollow" target="_blank" HREF="http://changed.to/protect/the-guilty.html"><i>"Fee Fi Fo Fun" (English Man)</i></a> (See also this other site <a rel="nofollow" target="_blank" href="http://stackoverflow.com">Nada</a>) Some other text here


RegEx for the title:

(?i).*?<i>([^<]*).*               [ ] g  [x] s  [ ] m  [ ] i

RegEx for the link:

(?i).*?href="([^"]*).*            [ ] g  [x] s  [ ] m  [ ] i
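
For reference, the two patterns can be checked outside Pipes against the sample text. This is only a minimal Python sketch, not how Pipes itself runs them. It assumes the replacement field holds the first capture group ($1 in Pipes, \1 here) and that the ticked "s" checkbox corresponds to re.DOTALL; the helper name extract is made up for illustration.

```python
import re

# Sample text from the question (note the upper-case HREF on the first link).
SAMPLE = (
    '<a rel="nofollow" target="_blank" '
    'HREF="http://changed.to/protect/the-guilty.html">'
    '<i>"Fee Fi Fo Fun" (English Man)</i></a> '
    '(See also this other site <a rel="nofollow" target="_blank" '
    'href="http://stackoverflow.com">Nada</a>) Some other text here'
)

TITLE_RE = r'(?i).*?<i>([^<]*).*'
LINK_RE = r'(?i).*?href="([^"]*).*'


def extract(pattern: str, text: str) -> str:
    """Replace the whole match with capture group 1 (assumed replacement)."""
    # re.DOTALL stands in for the "s" checkbox: "." also matches newlines.
    return re.sub(pattern, r'\1', text, count=1, flags=re.DOTALL)


title = extract(TITLE_RE, SAMPLE)
link = extract(LINK_RE, SAMPLE)
print(title)
print(link)
```

Because each pattern is wrapped in .*? and .* it consumes the whole item, so substituting the first group leaves only the captured title or URL.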

The case-insensitive (i) checkbox seems to be broken. Luckily, you can put (?i) at the start of the pattern instead, which works nicely.
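
To see why the flag matters here, consider input where the first link is written with an upper-case HREF. The following is a rough Python illustration of the inline (?i) flag, not of the Pipes checkbox itself; the two-link input string and the first_href helper are invented for the example.

```python
import re

# Hypothetical input: the first link uses upper-case HREF, the second lower-case.
html = (
    '<a HREF="http://first.example/">A</a> '
    '<a href="http://second.example/">B</a>'
)


def first_href(pattern: str, text: str) -> str:
    """Replace the whole match with capture group 1."""
    return re.sub(pattern, r'\1', text, count=1, flags=re.DOTALL)


# Case-sensitive: the upper-case HREF is skipped and the second link wins.
without_flag = first_href(r'.*?href="([^"]*).*', html)

# Inline (?i): matching ignores case, so the first link is captured.
with_inline = first_href(r'(?i).*?href="([^"]*).*', html)

print(without_flag)
print(with_inline)
```

Without the flag the pattern still matches, just on the wrong link, which makes this kind of failure easy to overlook.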

Here is a nice web2.0-ish tool to test regular expressions with: RegExr. But for some reason it's still beta. ;-)


One important thing to watch out for with Yahoo! Pipes: do not trust the debug screen. It has a quirk of hiding some tags from view, which can cause no end of confusion when writing regexes. To expose any hidden HTML, replace '<' with something like '#'.
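
The substitution trick amounts to a plain character replacement applied before inspecting the text. A small Python sketch of the idea, with a made-up input string and helper name (in Pipes the same thing would be one extra replace rule ahead of the real ones):

```python
def expose_tags(text: str, marker: str = '#') -> str:
    """Swap every '<' for a marker so a viewer cannot hide the tags."""
    return text.replace('<', marker)


# Hypothetical item content that a debug view might render without the tags.
raw = 'Title: <i>Fee Fi Fo Fun</i>'
visible = expose_tags(raw)
print(visible)
```

Once the marker shows where the tags really are, the replace rule can be removed and the regex adjusted against the true input.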