Ruby equivalent to “grep -C 5” to get context of l

Posted 2020-07-13 09:14

Question:

I've searched for this a bit, but I must be using the wrong terms: does Ruby have a way to grep for a string/regex and also return the surrounding 5 lines (above and below)? I know I could just call "grep -C 5 ..." or even write my own method, but it seems like something Ruby would have, and I'm just not using the right search terms.

Answer 1:

You can do it with a regular expression. Here's the string we want to search:

s = %{The first line
The second line
The third line
The fourth line
The fifth line
The sixth line
The seventh line
The eighth line
The ninth line
The tenth line
}

EOL is "\n" for me, but for you it might be "\r\n". I'll stick it in a constant. It is single-quoted so that the backslash escape reaches the regular expression unchanged:

EOL = '\n'

To simplify the regular expression, we'll define the pattern for "context" just once:

CONTEXT_LINES = 2
CONTEXT = "((?:.*#{EOL}){#{CONTEXT_LINES}})"

And we'll search for any line containing the word "fifth." Note that this regular expression must grab the entire line, including the end-of-line, for it to work:

regexp = /.*fifth.*#{EOL}/

Finally, do the search and show the results:

s =~ /^#{CONTEXT}(#{regexp})#{CONTEXT}/
before, match, after = $1, $2, $3
p before    # => "The third line\nThe fourth line\n"
p match     # => "The fifth line\n"
p after     # => "The sixth line\nThe seventh line\n"
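
The steps above can be collected into one method. This is only a sketch: the name grep_context and its parameters are hypothetical, it uses named captures instead of $1..$3 so that groups inside the pattern don't shift the numbering, and like the code above it only matches when a full set of context lines exists on both sides.

```ruby
# Hypothetical helper, not part of the answer above. Returns
# [before, line, after] for the first matching line, or nil if no line
# has the full context on both sides.
# `eol` is regex source text, hence the single-quoted default.
def grep_context(text, pattern, context_lines, eol = '\n')
  ctx = "(?:.*#{eol}){#{context_lines}}"
  re = /^(?<before>#{ctx})(?<line>.*#{pattern}.*#{eol})(?<after>#{ctx})/
  m = text.match(re)
  m && [m[:before], m[:line], m[:after]]
end

sample = (1..10).map { |n| "line #{n}\n" }.join
before, line, after = grep_context(sample, /line 5/, 2)
p before   # => "line 3\nline 4\n"
p line     # => "line 5\n"
p after    # => "line 6\nline 7\n"
```

Returning nil when nothing matches lets the caller tell "no match" apart from "match with empty context".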


Answer 2:

Thanks for the contextual grep. One addition: when the match is near the top or bottom of the input and fewer than CONTEXT_LINES lines are available, you probably still want whatever lines do exist. To get that, change the definition of CONTEXT as follows:

CONTEXT = "((?:.*#{EOL}){0,#{CONTEXT_LINES}})"

By default, quantifiers are greedy, so you'll grab as many of the CONTEXT_LINES lines as are available.
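
A small sketch of how the {0,n} form behaves at the edges. The method name flexible_context and the four-line sample string are made up for illustration. The rest follows the definitions above.

```ruby
# Hypothetical helper: same idea as above, but the context groups accept
# anywhere from 0 to `context_lines` lines.
def flexible_context(text, word, context_lines, eol = '\n')
  context = "((?:.*#{eol}){0,#{context_lines}})"
  m = text.match(/^#{context}(.*#{word}.*#{eol})#{context}/)
  m && m.captures
end

text = "one\ntwo\nthree\nfour\n"

# Only one line exists above "two", so `before` holds just that line.
p flexible_context(text, "two", 2)   # => ["one\n", "two\n", "three\nfour\n"]

# Nothing follows "four", so `after` is an empty string.
p flexible_context(text, "four", 2)  # => ["two\nthree\n", "four\n", ""]
```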



Answer 3:

I don't think you can pass extra arguments to grep for this, judging by the API.

You could always write a method. Something along the lines of this:

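def new_grep(enum, pattern, lines)
  # Indices of the matching lines. Using the pattern argument itself
  # (not the literal /pattern/) and indices rather than enum.index(x)
  # keeps duplicate lines from confusing the lookup.
  hits = enum.each_index.select { |i| pattern === enum[i] }
  # Expand each hit to its context window, clamped to the array bounds.
  wanted = hits.flat_map do |index|
    i = [index - lines, 0].max
    j = [index + lines, enum.length - 1].min
    (i..j).to_a
  end
  # uniq on indices merges overlapping windows without dropping repeated lines.
  wanted.uniq.map { |k| enum[k] }
end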
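
To get closer to what grep -C prints, where matches that are far apart come back as separate blocks, the kept lines can be split wherever the line numbers stop being consecutive. This is a sketch only: context_groups is a hypothetical name, and it assumes the input is an Array of Strings.

```ruby
# Hypothetical helper: returns an Array of groups, each group being the
# consecutive lines around one or more nearby matches.
def context_groups(lines, pattern, context)
  hits = lines.each_index.select { |i| pattern === lines[i] }
  keep = hits.flat_map do |i|
    ([i - context, 0].max..[i + context, lines.size - 1].min).to_a
  end.uniq.sort
  # Start a new group whenever two kept indices are not adjacent.
  keep.slice_when { |a, b| b != a + 1 }
      .map { |idxs| idxs.map { |k| lines[k] } }
end

letters = %w[a b c d e f g h i j]
p context_groups(letters, /c|i/, 1)  # => [["b", "c", "d"], ["h", "i", "j"]]
p context_groups(letters, /c|d/, 1)  # => [["b", "c", "d", "e"]]
```

For a file, the lines could come from File.readlines(path, chomp: true), and printing "--" between groups would mimic grep's separator.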