Here is example of log:
10.10.10.10 - - [21/Mar/2016:00:00:00 +0000] "GET /example?page=&per_page=100&scopes= HTTP/1.1" 200 769 "-" "" "1.1.1.1"
10.10.10.10 - - [21/Mar/2016:00:00:00 +0000] "GET /example?page=&per_page=500&scopes= HTTP/1.1" 200 769 "-" "" "1.1.1.1"
10.10.10.10 - - [21/Mar/2016:00:00:00 +0000] "GET /example?page=&per_page=100&scopes= HTTP/1.1" 200 769 "-" "" "1.1.1.1"
11.11.11.11 - - [21/Mar/2016:00:00:00 +0000] "GET /example?page=&per_page=10&scopes= HTTP/1.1" 200 769 "-" "" "1.1.1.1"
12.12.12.12 - - [21/Mar/2016:00:00:00 +0000] "GET /example?page=&per_page=500&scopes= HTTP/1.1" 200 769 "-" "" "1.1.1.1"
13.13.13.13 - - [21/Mar/2016:00:00:00 +0000] "GET /example HTTP/1.1" 200 769 "-" "" "1.1.1.1"
With following command
awk --re-interval '/per_page=[0-9]{3}/{cnt[$1]++} END{for (ip in cnt) print ip, cnt[ip]}' file
I can get counted and groupped result of each IPs witch cosist per_page >= 100 in parameters:
12.12.12.12 1
10.10.10.10 3
How I can modify it for output with per_page parameter value? For example (any format):
12.12.12.12 - per_page-500 - 1
10.10.10.10 - per_page-100 - 2
10.10.10.10 - per_page-500 - 1
awk
to the rescue!