Using sed to parse Apache logs between two timestamps

Posted 2019-07-25 10:21

I am trying to parse a log and get the lines between two timestamps. I tried the sed approach below, but I am having trouble with the regex.

Log pattern:

IP - - [20/Apr/2018:14:25:37 +0000] "GET / HTTP/1.1" 301 3936 "-" "
IP - - [20/Apr/2018:14:44:08 +0000]
----------------------------------

IP- - [20/Apr/2018:20:43:46 +0000]

I need the lines between 14:25 and 20:43 on 20 April; the log also contains other dates.

I tried this:

sed -n '/\[14:25/,/\[20:43/p' *-https_access.log.1

but it does not work.

4 answers
萌系小妹纸
Reply #2 · 2019-07-25 10:59

To print lines between match1 and match2 with sed or awk you can do:

sed -n '/match1/,/match2/p' inputfile
awk '/match1/,/match2/' inputfile

In your example, match1 is 20/Apr/2018:14:25 and match2 is 20/Apr/2018:20:43, so either of these commands should work for you:

sed -n '/20\/Apr\/2018:14:25/,/20\/Apr\/2018:20:43/p' inputfile
awk '/20\/Apr\/2018:14:25/,/20\/Apr\/2018:20:43/' inputfile

Or use | as the sed delimiter to avoid escaping the slashes:

sed -n '\|20/Apr/2018:14:25|,\|20/Apr/2018:20:43|p' inputfile
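The range command above can be tried on a small sample file. This is only a sketch: the log lines and the variable names `log` and `result` are made up for illustration, modelled on the format in the question.

```shell
# Build a small sample log (hypothetical data in the question's format).
log=$(mktemp)
cat > "$log" <<'EOF'
1.2.3.4 - - [19/Apr/2018:14:25:10 +0000] "GET /a HTTP/1.1" 200 10
1.2.3.4 - - [20/Apr/2018:14:25:37 +0000] "GET / HTTP/1.1" 301 3936
1.2.3.4 - - [20/Apr/2018:14:44:08 +0000] "GET /b HTTP/1.1" 200 20
1.2.3.4 - - [20/Apr/2018:20:43:46 +0000] "GET /c HTTP/1.1" 200 30
1.2.3.4 - - [20/Apr/2018:21:00:00 +0000] "GET /d HTTP/1.1" 200 40
EOF

# Print from the first line matching the start pattern through the
# first later line matching the end pattern.
result=$(sed -n '\|20/Apr/2018:14:25|,\|20/Apr/2018:20:43|p' "$log")
printf '%s\n' "$result"

rm -f "$log"
```

Note that a sed range ends at the first line matching the end pattern, so if several requests were logged at 20:43, only the first of them is printed.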
Juvenile、少年°
Reply #3 · 2019-07-25 11:02

sed is not well suited to this because it cannot easily compare elements such as the day and the hour.

With awk (commented inline):

awk -F '[ []' '
  {
    # separate the date from the time, then force awk to re-split the fields
    sub(/:/, " ", $5); $0 = $0 ""
  }

  # print if it is the right day and the time is between the two bounds
  # (string comparison works here because HH:MM:SS is fixed-width)
  $5 ~ /20.Apr.2018/ && $6 >= "14:25" && $6 < "20:44"
  ' YourFile

More generally, we can use variables to pass the date and the times to awk as parameters (not the purpose here).
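A minimal sketch of that parameterised form follows. The variable names `day`, `from` and `to` and the sample log lines are hypothetical, and the timestamp is located with index() and substr() rather than by field splitting, which assumes the fixed-width dd/Mon/yyyy:HH:MM:SS layout shown in the question.

```shell
# Sample log (hypothetical data in the question's format).
log=$(mktemp)
cat > "$log" <<'EOF'
IP - - [19/Apr/2018:15:00:00 +0000] "GET /a HTTP/1.1" 200 1
IP - - [20/Apr/2018:14:24:59 +0000] "GET /b HTTP/1.1" 200 1
IP - - [20/Apr/2018:14:25:37 +0000] "GET / HTTP/1.1" 301 3936
IP - - [20/Apr/2018:14:44:08 +0000] "GET /c HTTP/1.1" 200 1
IP - - [20/Apr/2018:20:43:46 +0000] "GET /d HTTP/1.1" 200 1
IP - - [20/Apr/2018:20:44:00 +0000] "GET /e HTTP/1.1" 200 1
IP - - [21/Apr/2018:15:00:00 +0000] "GET /f HTTP/1.1" 200 1
EOF

# day, from and to are passed in with -v; "to" is an exclusive upper bound.
selected=$(awk -v day="20/Apr/2018" -v from="14:25" -v to="20:44" '
  {
    p = index($0, "[")          # position of the opening bracket
    d = substr($0, p + 1, 11)   # date part, e.g. 20/Apr/2018
    t = substr($0, p + 13, 8)   # time part, e.g. 14:25:37
  }
  # same day, and time within [from, to) by string comparison
  d == day && t >= from && t < to
' "$log")
printf '%s\n' "$selected"

rm -f "$log"
```

Because the bounds are compared rather than matched, this works even when no request was logged at exactly 14:25 or 20:43.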

Bombasti
Reply #4 · 2019-07-25 11:05

Since you mentioned you want logs for 20 April, I'd suggest something like:

$ sed -n '/20\/Apr\/2018:14:25/,/20\/Apr\/2018:20:43/p' *-https_access.log.1

Including the date in the pattern makes false matches much less likely in case "20:43" occurs elsewhere in the log.
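The effect of including the date can be seen on a small two-day sample. This is an illustrative sketch only: the log lines and the names `time_only` and `with_date` are made up.

```shell
# Sample log spanning two days (hypothetical data).
log=$(mktemp)
cat > "$log" <<'EOF'
IP - - [19/Apr/2018:14:25:01 +0000] "GET /old HTTP/1.1" 200 1
IP - - [19/Apr/2018:20:43:10 +0000] "GET /old HTTP/1.1" 200 1
IP - - [20/Apr/2018:14:25:37 +0000] "GET / HTTP/1.1" 301 3936
IP - - [20/Apr/2018:20:43:46 +0000] "GET /x HTTP/1.1" 200 1
EOF

# Time-only patterns also open a range on 19 April.
time_only=$(sed -n '/:14:25/,/:20:43/p' "$log" | grep -c '')

# Patterns that include the date select only 20 April.
with_date=$(sed -n '/20\/Apr\/2018:14:25/,/20\/Apr\/2018:20:43/p' "$log" | grep -c '')

echo "time only: $time_only lines, with date: $with_date lines"

rm -f "$log"
```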

Rolldiameter
Reply #5 · 2019-07-25 11:18

The best solution is to use awk for this. What you need to do is convert your timestamps to Unix time and then compare them. In GNU awk (gawk) you can do this using mktime():

mktime(datespec [, utc-flag ]): Turn datespec into a timestamp in the same form as is returned by systime(). It is similar to the function of the same name in ISO C. The argument, datespec, is a string of the form YYYY MM DD HH MM SS [DST]. The string consists of six or seven numbers representing, respectively, the full year including century, the month from 1 to 12, the day of the month from 1 to 31, the hour of the day from 0 to 23, the minute from 0 to 59, the second from 0 to 60, and an optional daylight-savings flag.

To convert your timestamps from the form 20/Apr/2018:14:25:37 +0000 into the form 2018 04 20 14 25 37 that mktime() expects, and then compare them:

awk -v tstart="20/Apr/2018:14:25:00" -v tend="20/Apr/2018:20:43:00" \
     'function tounix(str,   a) {
        # split dd/Mon/yyyy:HH:MM:SS on "/", ":" and " "
        split(str,a,"/|:| ")
        return mktime(a[3]" "month[a[2]]" "a[1]" "a[4]" "a[5]" "a[6])
     }
     BEGIN{
       month["Jan"]="01";month["Feb"]="02";month["Mar"]="03"
       month["Apr"]="04";month["May"]="05";month["Jun"]="06"
       month["Jul"]="07";month["Aug"]="08";month["Sep"]="09"
       month["Oct"]="10";month["Nov"]="11";month["Dec"]="12"
       # the timestamp is the text between "[" and "]"
       FS="\\[|\\]"
       t1=tounix(tstart)
       t2=tounix(tend)
     }
     { t=tounix($2) }
     # print lines whose time lies in [t1, t2]
     (t1<=t && t<=t2)' inputfile

This method is robust because it does true time comparisons, which are independent of leap years and of day, month, or year crossovers. In contrast to the other solutions provided, it also does not require the timestamps tstart and tend to exist in the file.
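Where mktime() is not available, a similar comparison can be sketched in portable awk by turning each timestamp into a sortable yyyymmddHHMMSS string instead of Unix time. This is a different technique from the one above, not a drop-in equivalent: it ignores the UTC offset and so assumes every line uses the same one. The function name `key` and the sample log lines are hypothetical.

```shell
# Sample log; neither boundary timestamp appears in it (hypothetical data).
log=$(mktemp)
cat > "$log" <<'EOF'
IP - - [19/Apr/2018:23:59:59 +0000] "GET /a HTTP/1.1" 200 1
IP - - [20/Apr/2018:14:30:00 +0000] "GET /b HTTP/1.1" 200 1
IP - - [20/Apr/2018:20:42:59 +0000] "GET /c HTTP/1.1" 200 1
IP - - [20/Apr/2018:20:50:00 +0000] "GET /d HTTP/1.1" 200 1
EOF

in_window=$(awk -v tstart="20/Apr/2018:14:25:00" -v tend="20/Apr/2018:20:43:00" '
  # Turn dd/Mon/yyyy:HH:MM:SS into yyyymmddHHMMSS, which sorts
  # chronologically when compared as a string.
  function key(str,   a) {
    split(str, a, "[/: ]")
    return a[3] month[a[2]] a[1] a[4] a[5] a[6]
  }
  BEGIN {
    # map month abbreviations to two-digit numbers
    n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
    for (i = 1; i <= n; i++) month[names[i]] = sprintf("%02d", i)
    k1 = key(tstart); k2 = key(tend)
  }
  {
    p = index($0, "[")                # opening bracket of the timestamp
    k = key(substr($0, p + 1, 20))    # the 20 characters dd/Mon/yyyy:HH:MM:SS
  }
  k1 <= k && k <= k2
' "$log")
printf '%s\n' "$in_window"

rm -f "$log"
```

The two selected lines fall inside the window even though no line carries the exact start or end timestamp, which is the property the answer highlights.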
