Read line by line with awk and parse variables

I have a script that reads log files and parses the data to insert it into a MySQL table.

My script looks like this:

while read x;do
var=$(echo ${x}|cut -d+ -f1) 
var2=$(echo ${x}|cut -d_ -f3)
...
echo "$var,$var2,.." >> mysql.infile 
done<logfile

The problem is that the log files are thousands of lines long and the script takes hours.

I read that awk is better. I tried it, but I don't know the syntax to parse the variables.

EDIT: The inputs are structured firewall logs, so they are pretty large files with lines like:

@timestamp $HOST reason="idle Timeout" source-address="x.x.x.x" source-port="19219" destination-address="x.x.x.x" destination-port="53" service-name="dns-udp" application="DNS"....

So I'm using a lot of grep calls for ~60 variables, e.g.:

sourceaddress=$(echo ${x}|grep -P -o '.{0,0} source-address=\".{0,50}'|cut -d\" -f2)

If you think Perl would be better, I'm open to suggestions, and maybe a hint on how to script it.

Tags: bash parsing awk while-loop line

3 answers

男人必须洒脱

Reply #2 · 2019-09-21 07:42

To answer your question, I assume the following rules of the game:

each line contains various variables
each variable can be found by a different delimiter.

This gives you the following awk script:

awk 'BEGIN{OFS=","}
     { FS="+"; $0=$0; var1=$1;   # re-split on "+", keep field 1
       FS="_"; $0=$0; var2=$3;   # re-split on "_", keep field 3
               ...
       print var1,var2,... >> "mysql.infile"
     }' logfile

It basically does the following:

set the output separator to ,
read line
set the field separator to +, re-parse the line ($0=$0) and determine the first variable
set the field separator to '_', re-parse the line ($0=$0) and determine the second variable
... continue for all variables
print the line to the output file.
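
The steps above can also be sketched with awk's split() function, which cuts a copy of the line on a given separator without touching FS. This is only a minimal sketch under assumptions: the function name extract_fields, the array names, and the sample line are made up for illustration, and only the two variables from the question are extracted.

```shell
# Hypothetical helper: reads log lines on stdin, prints one CSV row per line.
extract_fields() {
    awk 'BEGIN { OFS = "," }
    {
        # Split a copy of the line on "+" and keep the first piece.
        split($0, plus_parts, "+")
        var1 = plus_parts[1]
        # Split the same line on "_" and keep the third piece.
        split($0, under_parts, "_")
        var2 = under_parts[3]
        print var1, var2
    }'
}

# Made-up sample line, only for illustration.
printf '%s\n' 'alpha+beta_gamma_delta_epsilon' | extract_fields
# prints: alpha,delta
```

Either way, the whole file goes through a single awk process instead of several subshells and cut calls per line, and starting those processes for every line is the likely reason the original loop is slow.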


Melony?

Reply #3 · 2019-09-21 07:52

The perl script below might help:

perl -ane '/^[^+]*/;printf "%s,",$&;/^([^_]*_){2}([^_]*){1}_.*/;printf "%s\n",$+' logfile

Since $& can incur a performance penalty, you could also use the /p modifier, like below:

perl -ane  '/^[^+]*/p;printf "%s,",${^MATCH};/^([^_]*_){2}([^_]*){1}_.*/;printf "%s\n",$+' logfile

For more on Perl regex matching, refer to PerlDoc.


我只想做你的唯一

Reply #4 · 2019-09-21 08:01

If you're extracting the values in order, something like this will help:

$ awk -F\" '{for(i=2;i<=NF;i+=2) print $i}' file 

idle Timeout
x.x.x.x
19219
x.x.x.x
53
dns-udp
DNS

You can easily change the output format as well:

$ awk -F\" -v OFS=, '{for(i=2;i<=NF;i+=2) 
                        printf "%s", $i ((i>NF-2)?ORS:OFS)}' file

idle Timeout,x.x.x.x,19219,x.x.x.x,53,dns-udp,DNS
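
If only some of the ~60 fields are needed, or they should come out in a fixed column order, the values can be looked up by name instead of by position. The following is a rough sketch, not a tested production script: the function name kv_to_csv, its comma-separated key argument, and the sample addresses are assumptions made for illustration. It also assumes that keys consist of letters and hyphens and that values contain no double quotes or commas.

```shell
# Hypothetical helper: print selected key="value" fields as one CSV row per line.
# $1 is a comma-separated list of key names, in the desired column order.
kv_to_csv() {
    awk -v keys="$1" 'BEGIN { n = split(keys, want, ",") }
    {
        # Drop values left over from the previous line.
        split("", val)
        rest = $0
        # Repeatedly match name="value" and store the value under its name.
        while (match(rest, /[A-Za-z-]+="[^"]*"/)) {
            pair = substr(rest, RSTART, RLENGTH)
            eq = index(pair, "=")
            val[substr(pair, 1, eq - 1)] = substr(pair, eq + 2, length(pair) - eq - 2)
            rest = substr(rest, RSTART + RLENGTH)
        }
        # Emit the requested keys in order; keys missing from the line stay empty.
        row = ""
        for (i = 1; i <= n; i++) row = row (i > 1 ? "," : "") val[want[i]]
        print row
    }'
}

# Made-up sample line in the format shown in the question.
printf '%s\n' '@timestamp $HOST reason="idle Timeout" source-address="10.0.0.1" source-port="19219" destination-port="53" application="DNS"' |
    kv_to_csv 'source-address,destination-port,application'
# prints: 10.0.0.1,53,DNS
```

Redirecting the function's output to mysql.infile would give the same kind of file the original loop builds, with each log line handled once inside awk.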

