grep command with a lookahead pattern does not sel

Posted 2020-04-15 11:25

Question:

I was trying to use the following grep command:

grep '(.*)(?=(png|html|jpg|js|css)(?:\s*))(png|html|jpg|js|css.*\s)' file

File contains the following:

 http://manage.bostonglobe.com/GiftTheGlobe/LandingPage.html
 https://manage.bostonglobe.com/cs/mc/login.aspx?p1=BGFooter
 https://www.bostonglobe.com/bgcs
 /newsletters?p1=BGFooter_Newsletters
 https://bostonglobe.custhelp.com/app/home?p1=BGFooter
 https://bostonglobe.custhelp.com/app/answers/list?p1=BGFooter
 /tools/help/stafflist?p1=BGFooter
 https://www.bostonglobemedia.com/
 https://manage.bostonglobe.com/Order/newspaper/Newspaper.aspx
 https://www.facebook.com/globe
 https://twitter.com/#!/BostonGlobe
 https://plus.google.com/108227564341535363126/about
 https://epaper.bostonglobe.com/launch.aspx?pbid=2c60291d-c20c-4780-9829-     b3d9a12687cf
 http://nieonline.com/bostonglobe/
 https://secure.pqarchiver.com/boston-sub/no_default.html?ss=1&url=%2Fboston-sub%2Fadvancedsearch.html
 /tools/help/privacy?p1=BGFooter
 /tools/help/terms-service?p1=BGFooter
 /termsofpurchase?p1=BGFooter
 https://www.bostonglobemedia.com/careers
 /css/globe-print.css?v=19256I1935
 //meter.bostonglobe.com/css/style.css
 /css/globe-print.css?v=19256I1935
 //cdn.blueconic.net/bostonglobemedia.js
 /js/lib/rwd-images.js,lib/respond.min.js,lib/modernizr.custom.min.js,globe-          define.js,globe-controller.js?v=19256I1935
 data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///ywAAAAAAQABAAACAUwAOw==
 /js/lib/jquery.js,lib/lo-dash-custom-2.4.1.js,lib/a9.js,lib/pb.js,dist/ad-     init.js,globe-newsletter.js,globe-profile-page.js,dist/globe-topic-nav.js,dist/rakuten.js?v=19256I1935
 //dc8xl0ndzn2cb.cloudfront.net/js/bostonglobe/v0/keywee.min.js

For some reason it doesn't select anything from that file. I've tried different flags but can't figure out what's wrong.

Answer 1:

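You are using a PCRE (Perl-compatible) regex, but grep uses the POSIX BRE engine by default. In BRE, unescaped ( ) | ? are literal characters and lookaheads such as (?=...) are not supported, so the pattern matches nothing in your file.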

To make such patterns work, use the -P option (available in GNU grep):

grep -P 'YOUR_PCRE_PATTERN'
     ^^
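
As a rough illustration of the difference, the sketch below runs one lookahead pattern through both engines. The three sample lines are taken from the question's file, but the temporary file and the simplified pattern \.(?=(html|css)$) are assumptions made for the example, not the asker's original regex. It also assumes a grep that supports -P.

```shell
# Hypothetical sample file holding three of the URLs from the question.
tmp=$(mktemp)
printf '%s\n' \
  'http://manage.bostonglobe.com/GiftTheGlobe/LandingPage.html' \
  'https://www.bostonglobe.com/bgcs' \
  '//meter.bostonglobe.com/css/style.css' > "$tmp"

# Default BRE engine: ( ) | ? are literal characters, so no line matches.
# grep exits non-zero when nothing matches, hence the "|| true".
bre_count=$(grep -c '\.(?=(html|css)$)' "$tmp" || true)

# PCRE engine (-P): the lookahead is understood, so lines ending
# in .html or .css are counted.
pcre_count=$(grep -cP '\.(?=(html|css)$)' "$tmp" || true)

echo "BRE matches:  $bre_count"
echo "PCRE matches: $pcre_count"
rm -f "$tmp"
```

With GNU grep this should report 0 matching lines for the default engine and 2 for -P, which mirrors the "selects nothing" symptom in the question.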

To develop and test PCRE patterns, regex101.com is a commonly recommended tool.

Note that on macOS, you can install GNU grep via Homebrew (brew install grep).
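
If it is unclear whether a given machine's grep accepts -P, a small probe along these lines may help. The function name pick_pcre_grep is hypothetical. The sketch assumes that Homebrew installs GNU grep under the name ggrep, which is its default behaviour.

```shell
# Print the name of the first grep binary that accepts -P, or fail.
# "ggrep" is the name Homebrew gives GNU grep by default.
pick_pcre_grep() {
  for candidate in grep ggrep; do
    # Probe with a tiny lookahead pattern; a grep without PCRE support
    # rejects -P and exits non-zero.
    if command -v "$candidate" >/dev/null 2>&1 &&
       printf 'a\n' | "$candidate" -qP 'a(?=$)' 2>/dev/null; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

A script could then run something like "$(pick_pcre_grep)" -P 'YOUR_PCRE_PATTERN' file rather than hard-coding grep.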



Tags: regex awk grep