I have a script that reads log files and parses the data to insert it into a MySQL table. My script looks like this:
while read x; do
    var=$(echo ${x}|cut -d+ -f1)
    var2=$(echo ${x}|cut -d_ -f3)
    ...
    echo "$var,$var2,.." >> mysql.infile
done < logfile
The problem is that the log files are thousands of lines long, so this takes hours. I read that awk is better; I tried it, but I don't know the syntax for parsing out the variables.
EDIT: the inputs are structured firewall logs, so they are pretty large files, with lines like:
@timestamp $HOST reason="idle Timeout" source-address="x.x.x.x" source-port="19219" destination-address="x.x.x.x" destination-port="53" service-name="dns-udp" application="DNS"....
So I'm using a lot of grep, one call per variable, for ~60 variables, e.g.:

sourceaddress=$(echo ${x}|grep -P -o '.{0,0}source-address=\".{0,50}'|cut -d\" -f2)
If you think perl would be better, I'm open to suggestions and maybe a hint on how to script it.
To answer your question, I assume the rules of the game implied by your loop: each line contains a set of variables, and each variable can be extracted with its own delimiter and field number (cut -d+ -f1, cut -d_ -f3, and so on). This gives you the following awk script:
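A minimal sketch of that script, assuming (as in your loop) that the first variable is the first +-delimited field and the second is the third _-delimited field:

awk 'BEGIN { OFS="," }
{
    FS="+"; $0=$0; var=$1     # split on "+" and take the first field
    FS="_"; $0=$0; var2=$3    # re-split on "_" and take the third field
    print var, var2           # add more FS / $0=$0 steps for the other variables
}' logfile > mysql.infile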
It basically does the following:
- set the output field separator (OFS) to ,
- set the field separator (FS) to +, re-parse the line ($0=$0) and determine the first variable
- set the field separator to _, re-parse the line ($0=$0) and determine the second variable

The perl script below might help:
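A sketch of that approach (not the original script), assuming every field has the key="value" form shown in your sample line; it walks the pairs in order and pulls each value out of the matched text in $&:

#!/usr/bin/perl
use strict;
use warnings;

# Read the log from stdin or from file arguments; print one CSV record per input line.
while (<>) {
    my @values;
    while (/[\w-]+="[^"]*"/g) {           # each key="value" pair, left to right
        my ($v) = $& =~ /"([^"]*)"/;      # $& holds the whole matched pair; keep just the value
        push @values, $v;
    }
    print join(',', @values), "\n";
}

Save it as, say, parse.pl (the name is arbitrary) and run it with perl parse.pl logfile > mysql.infile.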
Since $& can result in a performance penalty, you could also use the /p modifier, like below:
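The same sketch with /p, reading the match from ${^MATCH} instead of $& (on older perls this avoids the global copy that $& triggers):

#!/usr/bin/perl
use strict;
use warnings;

while (<>) {
    my @values;
    while (/[\w-]+="[^"]*"/pg) {               # /p exposes this match as ${^MATCH}
        my ($v) = ${^MATCH} =~ /"([^"]*)"/;    # same extraction, without using $&
        push @values, $v;
    }
    print join(',', @values), "\n";
}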
For more on perl regex matching, refer to [PerlDoc].

If you're extracting the values in order, something like this will help:
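For instance, a one-liner along these lines (it assumes the fields appear in the same order on every line, so the values can be captured positionally rather than by name):

perl -lne 'print join ",", /="([^"]*)"/g' logfile > mysql.infile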
You can easily change the output format as well.