grep -f: print matches in the order of the pattern file

Posted 2020-03-13 03:30

Question:

I need to grep for patterns listed in a file, but I need the matches printed in the order of the patterns.

$ cat patt.grep
name1
name2

$ grep -f patt.grep myfile.log
name2:some xxxxxxxxxx
name1:some xxxxxxxxxx

grep prints the lines in the order they appear in the log: name2 is found first, so it is printed first, followed by name1. My requirement is to get name1 first, following the order of the patt.grep file.

The output I expect is:

name1:some xxxxxxxxxx
name2:some xxxxxxxxxx

Answer 1:

You can pipe patt.grep to xargs, which will pass the patterns to grep one at a time.

By default, xargs appends arguments at the end of the command. But in this case, grep needs myfile.log to be the last argument. So use the -I option to name a placeholder (hello in the command below; {} is the conventional choice) that xargs replaces with each input line.

cat patt.grep | xargs -Ihello grep hello myfile.log
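
To see the effect on the question's sample data, here is a minimal, self-contained sketch. It recreates patt.grep and myfile.log in a scratch directory (the directory and the exact file contents are assumptions based on the sample in the question) and uses the conventional {} placeholder.

```shell
#!/bin/sh
# Sketch: recreate the question's sample files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' name1 name2 > patt.grep
printf '%s\n' 'name2:some xxxxxxxxxx' 'name1:some xxxxxxxxxx' > myfile.log

# Plain grep -f prints matches in log order (name2 first).
log_order=$(grep -f patt.grep myfile.log)

# xargs -I runs grep once per pattern line, so output follows patt.grep.
pattern_order=$(xargs -I{} grep -e {} myfile.log < patt.grep)

printf '%s\n' "$pattern_order"
```

Note that xargs gives quotes and backslashes in its input special treatment, so this approach is best suited to simple patterns like the ones shown.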


Answer 2:

Use the regexes in patt.grep one after another in order of appearance by reading line-wise:

while IFS= read -r ptn; do grep -e "$ptn" myfile.log; done < patt.grep
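
One case the loop does not guard against is an empty line in patt.grep: an empty pattern matches every line, so grep would print the whole log for it. Below is a minimal sketch that skips such lines. The function name grep_in_order is hypothetical, and treating patterns as fixed strings with -F is an assumption; drop -F to keep regex matching.

```shell
#!/bin/sh
# grep_in_order PATTERN_FILE LOG_FILE  (hypothetical helper name)
# Prints matching log lines grouped by pattern, in pattern-file order.
grep_in_order() {
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r ptn || [ -n "$ptn" ]; do
        # Skip empty lines: an empty pattern would match every log line.
        [ -n "$ptn" ] || continue
        # -F: treat the pattern as a fixed string (assumption);
        # -e: keep patterns starting with '-' from being read as options.
        grep -F -e "$ptn" -- "$2"
    done < "$1"
}
```

It would be called as grep_in_order patt.grep myfile.log.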


Answer 3:

I tried the same situation and solved it with the command below.

If your data is in the format you showed, and the patterns in patt.grep are listed in sorted order, you can use this:

grep -f patt.grep myfile.log | sort
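
A small sketch of the case where this breaks down: the patterns are listed in reverse alphabetical order (the sample file contents are assumptions for illustration), yet the sorted output still puts name1 first.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Patterns deliberately NOT in alphabetical order.
printf '%s\n' name2 name1 > patt.grep
printf '%s\n' 'name1:some xxxxxxxxxx' 'name2:some xxxxxxxxxx' > myfile.log

# sort orders by line content and knows nothing about patt.grep,
# so name1 comes first even though the pattern file lists name2 first.
sorted=$(grep -f patt.grep myfile.log | sort)
printf '%s\n' "$sorted"
```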



Answer 4:

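This should do it, provided each pattern equals the first :-separated field of a log line. A pattern with no matching line is printed as the pattern repeated twice on one line, and the result is written to the file z:

awk -F":" 'NR==FNR{a[$1]=$0;next}{ if ($1 in a) {print a[$1]} else {print $1, $1} }' myfile.log patt.grep > z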



Answer 5:

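A simple workaround would be to sort the log file by its first :-separated field before grep (the <(...) process substitution needs bash, ksh, or zsh):

grep -f patt.grep <(sort -t: -k1,1 myfile.log)

However, this yields the desired order only if the patterns in patt.grep are themselves listed in sorted order.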

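To preserve the order given in the pattern file, you can use awk instead. Read the log first and index each line by its first :-separated field, then walk patt.grep and print the stored line for each pattern (if a key occurs more than once, only its last line is kept):

awk -F: 'NR==FNR{a[$1]=$0;next}$1 in a{print a[$1]}' myfile.log patt.grep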


Answer 6:

This can't be done in grep alone.

For a simple and pragmatic, but inefficient solution, see owlman's answer. It invokes grep once for each pattern in patt.grep.

If that's not an option, consider the following approach:

grep -f patt.grep myfile.log |
 awk -F: 'NR==FNR { l[$1]=$0; next } $1 in l {print l[$1]}' - patt.grep
  • Passes all patterns to grep in a single pass,
  • then reorders the matching lines to follow the order of patterns in patt.grep using awk:
    • first reads all output lines (passed via stdin, -, i.e., through the pipe) into an associative array, using the 1st :-based field as the key
    • then loops over the lines of patt.grep and prints the corresponding output line, if any.

Constraints:

  • Assumes that all patterns in patt.grep match the 1st :-based token in the log file, as implied by the sample output data in the question.
  • Assumes that each pattern only matches once - if multiple matches are possible, the awk solution would have to be made more sophisticated.