I need help as I am new to log parsing. I'm trying to extract all log lines that have a 200 status, with a timestamp of 15 hours before 15:35. I am not able to figure out the regex to be used.
Here is a sample of the log:
198.104.78.160 [26/Dec/2016:15:24:12 -0500] 200 190.50.175.65:8080 200 testtest.com GET /api/bid_request?feed=1&auth=qwerty&ip=85.194.119.3&ua=Mozilla%2F5.0+%28Windows+NT+6.1%3B+Win64%3B+x64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F48.0.2564.97+Safari%2F537.36&lang=tr-TR%2Ctr%3Bq%3D0.8%2Cen-US%3Bq%3D0.6%2Cen%3Bq%3D0.4&ref=http%3A%2F%2Fserve.pop.net%2Fs HTTP/1.0 - - - 174.194.36.141 - 0.109-0.009 US /
You can use awk to do that:
Set some variables to store your requirements: status_code, ts_at_hour, ts_before_hour and ts_before_min (you can set them from environment variables).
The regex is used in a match that focuses on 4 groups: hour, minutes and seconds, each defined by ([0-9]{2}), and status_code at the end, defined by ([0-9]{3}).
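To make the role of the four variables concrete, here is a minimal sketch of the filter condition, written in Python rather than awk just to show the logic. The function name keep_line is made up for illustration. How the thresholds are read is my assumption, not something stated above: the hour must equal ts_at_hour and the time must be strictly earlier than ts_before_hour:ts_before_min.

```python
def keep_line(hour, minute, status,
              status_code=200, ts_at_hour=15,
              ts_before_hour=15, ts_before_min=35):
    """Return True if an already-parsed log entry meets the requirements.

    Assumed interpretation: status must equal status_code, the hour must
    equal ts_at_hour, and (hour, minute) must be strictly earlier than
    (ts_before_hour, ts_before_min).
    """
    if status != status_code:
        return False
    if hour != ts_at_hour:
        return False
    # Tuple comparison orders by hour first, then by minute.
    return (hour, minute) < (ts_before_hour, ts_before_min)


print(keep_line(15, 24, 200))  # True: the sample line, 15:24 with status 200
print(keep_line(15, 40, 200))  # False: after the 15:35 cutoff
```

In awk the same condition would be applied to the captured groups, with the thresholds passed in as variables.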
To decompose the regex, you have:
[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+ for the IP address, followed by one or more whitespace characters, \s+
\[[0-9]{2}\/[a-zA-Z]{3}\/[0-9]{4}:([0-9]{2}):([0-9]{2}):([0-9]{2})\s+[+-][0-9]{4}\] for the bracketed timestamp (notice the 3 groups between parentheses)
([0-9]{3}) for the status code
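The pieces above can be put together and tried on the sample line. This is a small Python sketch rather than the awk command itself, since the regex syntax is the same. The names IP, TIMESTAMP, STATUS, LOG_RE and parse are made up for illustration. It also assumes one or more whitespace characters between the timestamp and the status code, which the breakdown above does not spell out.

```python
import re

# The three pieces of the pattern, as described above.
IP = r"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+"
TIMESTAMP = (r"\[[0-9]{2}\/[a-zA-Z]{3}\/[0-9]{4}"
             r":([0-9]{2}):([0-9]{2}):([0-9]{2})\s+[+-][0-9]{4}\]")
STATUS = r"([0-9]{3})"

# Assumption: whitespace (\s+) also separates the timestamp from the status.
LOG_RE = re.compile(IP + r"\s+" + TIMESTAMP + r"\s+" + STATUS)


def parse(line):
    """Return (hour, minute, second, status) as ints, or None if no match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    hour, minute, second, status = (int(g) for g in m.groups())
    return hour, minute, second, status


# Shortened copy of the sample log line from the question.
sample = ("198.104.78.160 [26/Dec/2016:15:24:12 -0500] 200 "
          "190.50.175.65:8080 200 testtest.com")
print(parse(sample))  # (15, 24, 12, 200)
```

Groups 1 to 3 are the hour, minutes and seconds, and group 4 is the first status code after the timestamp. Those are the values to compare against status_code, ts_at_hour, ts_before_hour and ts_before_min.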