How can I search for a multiline pattern in a file

Posted 2019-01-01 00:40

I needed to find all the files that contained a specific string pattern. The first solution that comes to mind is using find piped with xargs grep:

find . -iname '*.py' | xargs grep -e 'YOUR_PATTERN'

But if I need to find patterns that span more than one line, I'm stuck, because vanilla grep can't match multiline patterns.

11 answers
孤独总比滥情好
#2 · 2019-01-01 01:07

So I discovered pcregrep which stands for Perl Compatible Regular Expressions GREP.

For example, suppose you need to find files where the '_name' variable is immediately followed by the '_description' variable:

find . -iname '*.py' | xargs pcregrep -M '_name.*\n.*_description'

Tip: you need to include the line break character in your pattern. Depending on your platform, it could be '\n', '\r', '\r\n', ...
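
To check which line break a file actually uses before writing the pattern, something like the sketch below may help. It assumes a POSIX shell with grep; the two sample files and their contents are made up for illustration.

```shell
# Two hypothetical sample files: one with LF endings, one with CRLF endings.
unix_file=$(mktemp)
dos_file=$(mktemp)
printf '_name = 1\n_description = 2\n' > "$unix_file"
printf '_name = 1\r\n_description = 2\r\n' > "$dos_file"

# A literal carriage return; command substitution only strips trailing newlines.
cr=$(printf '\r')

# grep -c prints the number of lines that contain a carriage return.
unix_cr=$(grep -c "$cr" "$unix_file")
dos_cr=$(grep -c "$cr" "$dos_file")

printf 'LF file:   %s lines with CR\n' "$unix_cr"
printf 'CRLF file: %s lines with CR\n' "$dos_cr"

rm -f "$unix_file" "$dos_file"
```

A non-zero count suggests CRLF line endings. In a PCRE pattern, writing the line break as '\r?\n' should accept both kinds.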

泪湿衣
#3 · 2019-01-01 01:11

Why don't you go for awk:

awk '/Start pattern/,/End pattern/' filename
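
A small sketch of what the range pattern prints, assuming a POSIX shell and awk. The sample file and its contents are invented for the demonstration.

```shell
# Hypothetical sample file with one Start/End block surrounded by other lines.
sample=$(mktemp)
printf '%s\n' 'header' 'Start pattern' 'body 1' 'End pattern' 'footer' > "$sample"

# The range pattern selects every line from a line matching "Start pattern"
# through the next line matching "End pattern", both inclusive.
range_out=$(awk '/Start pattern/,/End pattern/' "$sample")
printf '%s\n' "$range_out"

rm -f "$sample"
```

Here 'header' and 'footer' are left out and only the three lines of the block are printed.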
宁负流年不负卿
#4 · 2019-01-01 01:12

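grep -P also uses libpcre, but is much more widely installed. To find the complete title section of an HTML document, even if it spans multiple lines, you can use this (with GNU grep, -z is needed so that the whole file is read as a single record instead of line by line, and -o prints only the matched part):

grep -Pzo '(?s)<title>.*?</title>' example.html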
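
As a sketch of how this behaves on a title that spans several lines, assuming GNU grep built with PCRE support (the -z and -o flags and the sample file are my additions for illustration):

```shell
# Hypothetical HTML file whose <title> element spans three lines.
page=$(mktemp)
printf '%s\n' '<html><head>' '<title>' 'My page' '</title>' '</head></html>' > "$page"

# -P: PCRE syntax; -z: records end at NUL, so the whole file is one record;
# -o: print only the match. (?s) lets "." match newlines, ".*?" is non-greedy.
# With -z the match is terminated by a NUL byte, so tr turns it into a newline.
title=$(grep -Pzo '(?s)<title>.*?</title>' "$page" | tr '\0' '\n')
printf '%s\n' "$title"

rm -f "$page"
```

Without -z, GNU grep hands the pattern one line at a time, so (?s) alone never sees the line break.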

Since the PCRE project implements the Perl regular expression syntax, the Perl documentation can be used as a reference.

只靠听说
#5 · 2019-01-01 01:15

@Marcin: a non-greedy awk example:

awk '{if ($0 ~ /Start pattern/) {triggered=1;}if (triggered) {print; if ($0 ~ /End pattern/) { exit;}}}' filename
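
To see how this differs from the plain range form, here is a sketch on a made-up file with two Start/End blocks (POSIX shell and awk assumed):

```shell
# Hypothetical sample file containing two separate Start/End blocks.
sample=$(mktemp)
printf '%s\n' 'Start pattern' 'a' 'End pattern' 'between' \
              'Start pattern' 'b' 'End pattern' > "$sample"

# Range form: prints every Start..End block in the file.
all_blocks=$(awk '/Start pattern/,/End pattern/' "$sample")

# Flag-and-exit form: starts printing at the first "Start pattern"
# and quits as soon as the first "End pattern" has been printed.
first_block=$(awk '{if ($0 ~ /Start pattern/) {triggered=1;}if (triggered) {print; if ($0 ~ /End pattern/) { exit;}}}' "$sample")

printf 'range form:\n%s\n\nexit form:\n%s\n' "$all_blocks" "$first_block"

rm -f "$sample"
```

The range form prints both blocks (six lines, without 'between'), while the exit form prints only the first block and stops reading the file.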
君临天下
#6 · 2019-01-01 01:17

With The Silver Searcher (ag):

ag 'abc.*(\n|.)*efg'

The Silver Searcher's speed optimizations could shine here.

萌妹纸的霸气范
#7 · 2019-01-01 01:19

You can use the grep alternative sift here (disclaimer: I am the author).

It supports multiline matching and limiting the search to specific file types out of the box:

sift -m --files '*.py' 'YOUR_PATTERN'

(search all *.py files for the specified multiline regex pattern)

It is available for all major operating systems. Take a look at the samples page to see how it can be used to extract multiline values from an XML file.
