I'm trying to extract specific parts of a file such as the one below:
1443113312 mongo client connection created with mongodb://172.28.128.5:27017
1443113312 [OVERALL], RunTime(ms), 4864.0
1443113313 [READ], Return=0, 485
1443113313 [CLEANUP], 99thPercentileLatency(us), 4487.0
1443113314 [UPDATE], 99thPercentileLatency(us), 27743.0
This is the output I'm expecting:
mongodb://172.28.128.5 Operations=OVERALL 1443113312
mongodb://172.28.128.5 Operations=READ 1443113313
mongodb://172.28.128.5 Operations=CLEANUP 1443113313
mongodb://172.28.128.5 Operations=UPDATE 1443113314
I really appreciate any suggestion. Thanks.
$ awk -F'[][ \t:]+' '/mongodb/{a=$(NF-2)":"$(NF-1);next} a{printf "%s Operations=%-7s %s\n",a,$2,$1}' file
mongodb://172.28.128.5 Operations=OVERALL 1443113312
mongodb://172.28.128.5 Operations=READ    1443113313
mongodb://172.28.128.5 Operations=CLEANUP 1443113313
mongodb://172.28.128.5 Operations=UPDATE  1443113314
How it works
-F'[][ \t:]+'
This sets the field separator to any combination of spaces, tabs, colons, or square brackets ([ and ]).
/mongodb/{a=$(NF-2)":"$(NF-1);next}
If the line contains mongodb, then we save the third-to-last and second-to-last fields, joined by a colon, in the variable a and skip to the next line.
a{printf "%s Operations=%-7s %s\n",a,$2,$1}
If the variable a has been assigned a value, then print out the current line, reformatted as per the question.
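To see why those field numbers line up, it can help to dump the fields awk produces with that separator. A minimal sketch, using two lines from the question as input; show_fields is just a name made up for illustration:

```shell
# show_fields is a hypothetical helper: it prints every field awk sees
# on each input line when using the same separator as the answer above.
show_fields() {
  awk -F'[][ \t:]+' '{ for (i = 1; i <= NF; i++) printf "%d=<%s>\n", i, $i }'
}

# Connection line: the last three fields are mongodb, //172.28.128.5 and 27017,
# so $(NF-2) ":" $(NF-1) rebuilds mongodb://172.28.128.5.
printf '%s\n' '1443113312 mongo client connection created with mongodb://172.28.128.5:27017' | show_fields

# Data line: field 1 is the timestamp and field 2 is the operation name.
printf '%s\n' '1443113312 [OVERALL], RunTime(ms), 4864.0' | show_fields
```

Because the colon is part of the separator, it is consumed during splitting, which is why the program has to put it back by hand when building a.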
Variation
This prints only the mongodb string without the IP, puts the operation in double quotes, and separates the columns with tabs:
$ awk -F'[][ \t:]+' '/mongodb/{a=$(NF-2);next} a{printf "%s\tOperations=\"%s\"\t%s\n",a,$2,$1}' file
mongodb Operations="OVERALL" 1443113312
mongodb Operations="READ" 1443113313
mongodb Operations="CLEANUP" 1443113313
mongodb Operations="UPDATE" 1443113314
Perl to the rescue!
perl -nwe 'if (m=mongo client connection created with (mongodb://[0-9.]+)=) {
    $url = $1;
} elsif (/^([0-9]+) \[([[:upper:]]+)\]/) {
    print "$url Operations=$2 $1\n";
}' input-file
Explanation: -n reads the input line by line. Each time the "created" line is encountered, the URL is saved in the $url variable. Each time a line starts with a number (the timestamp) followed by an upper-case word in square brackets, the URL, the operation, and the timestamp are printed.
This might work for you (GNU sed & printf):
sed -rn '\|://|h;G;s/^(\S+) \[(\S+)\].* (\S+):.*/printf "%s Operations=%-7s %s" \3 \2 \1/ep' file
This uses GNU sed's e flag, which executes the pattern space as a shell command and replaces it with the command's output. Alternatively, the evaluation can be done in a separate process by piping the printf commands to a shell, like so:
sed -rn '\|://|h;G;s/^(\S+) \[(\S+)\].* (\S+):.*/printf "%s Operations=%-7s %s\n" \3 \2 \1/p' file | sh
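Since the generated text ends up being executed by a shell, it may be worth looking at it before piping it to sh. A sketch, assuming GNU sed; gen_printf is a hypothetical wrapper name, and the input is taken from the sample in the question:

```shell
# gen_printf is a hypothetical wrapper around the sed command above (GNU sed only).
# With no file argument it reads standard input.
gen_printf() {
  sed -rn '\|://|h;G;s/^(\S+) \[(\S+)\].* (\S+):.*/printf "%s Operations=%-7s %s\n" \3 \2 \1/p' "$@"
}

sample='1443113312 mongo client connection created with mongodb://172.28.128.5:27017
1443113312 [OVERALL], RunTime(ms), 4864.0
1443113313 [READ], Return=0, 485'

# Step 1: inspect the generated printf commands without running them.
printf '%s\n' "$sample" | gen_printf

# Step 2: once they look right, hand them to a shell.
printf '%s\n' "$sample" | gen_printf | sh
```

With GNU sed, the \n in the replacement comes out as a real line break inside the quoted format string, so each command may span two lines in step 1; the shell still prints the same result. Only run step 2 on input you trust, because anything that ends up in the captured groups is interpreted by the shell.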