I am new to the ELK stack. I have a Filebeat service sending logs to Logstash, and in Logstash, using a grok filter, the data is pushed to an Elasticsearch index.
I am using the grok filter with match => { "message" => "%{COMBINEDAPACHELOG}" } to parse the data.
My issue is that I want the names of the fields and their values to be stored in the Elasticsearch index. The different versions of my logs are below:
27.60.18.21 - - [27/Aug/2017:10:28:49 +0530] "GET /api/v1.2/places/search/json?username=pradeep.pgu&location=28.5359586,77.3677936&query=atm&explain=true&bridge=true HTTP/1.1" 200 3284
27.60.18.21 - - [27/Aug/2017:10:28:49 +0530] "GET /api/v1.2/places/search/json?username=pradeep.pgu&location=28.5359586,77.3677936&query=atms&explain=true&bridge=true HTTP/1.1" 200 1452
27.60.18.21 - - [27/Aug/2017:10:28:52 +0530] "GET /api/v1.2/places/nearby/json?&refLocation=28.5359586,77.3677936&keyword=FINATM HTTP/1.1" 200 3283
27.60.18.21 - - [27/Aug/2017:10:29:06 +0530] "GET /api/v1.2/places/search/json?username=pradeep.pgu&location=28.5359586,77.3677936&query=co&explain=true&bridge=true HTTP/1.1" 200 3415
27.60.18.21 - - [27/Aug/2017:10:29:06 +0530] "GET /api/v1.2/places/search/json?username=pradeep.pgu&location=28.5359586,77.3677936&query=cof&explain=true&bridge HTTP/1.1" 200 2476
The fields that I want in the Elasticsearch index are below:
- client_ip => type must be compatible with what Kibana uses for IP mapping.
- timestamp => datetime format => the time of the log.
- method => text => the method that was called, e.g. GET, POST.
- version => decimal number => e.g. 1.2 / 1.0 (in the sample logs as v1.2).
- username => text => the text after username= (in the sample logs, pradeep.pgu).
- location => geo_point type => the value has both latitude and longitude, so that Kibana can plot these on the map.
- search_query => text => the thing that was searched (in the sample, from either of the two fields "keyword=" or "query="). Only one of the two fields will be present, and the value of whichever is present must be used.
- response_code => number => the code of the response (200 in the sample).
- data_transfered => number => the amount of data transferred (the last number in the sample).
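For reference, the index mapping implied by the list above would look something like this (a sketch; the index name my-logs-index and the exact types are assumptions based on the field descriptions):

```
PUT my-logs-index
{
  "mappings": {
    "properties": {
      "client_ip":       { "type": "ip" },
      "timestamp":       { "type": "date" },
      "method":          { "type": "keyword" },
      "version":         { "type": "float" },
      "username":        { "type": "keyword" },
      "location":        { "type": "geo_point" },
      "search_query":    { "type": "text" },
      "response_code":   { "type": "integer" },
      "data_transfered": { "type": "long" }
    }
  }
}
```

With location mapped as geo_point, Elasticsearch accepts a "lat,lon" string directly, so the value from the log can be indexed as-is and plotted on a Kibana map.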
Is such a thing even possible? Does the grok filter have a provision for this? The thing is, the parameters are not order-specific.
Starting from the HTTPD_COMMONLOG pattern, you could write your own grok pattern (which you can test in an online grok tester). Once the grok filter has extracted the request, you can use the kv filter on it, which will extract the query parameters (and sidesteps the problem of the parameters not being order-specific); you'll have to set its field_split option to &. For search_query, depending on which field is present, we use the mutate filter with the add_field option to create the field.
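Putting the pieces together, the filter section could look roughly like this (a sketch: the grok pattern, in particular the /api/v.../json path structure, is an assumption derived from the sample URLs, so adjust it to your real logs):

```
filter {
  grok {
    # Assumed pattern, modelled on HTTPD_COMMONLOG and the sample lines;
    # it captures the query string into "request" for the kv filter.
    match => {
      "message" => '%{IPORHOST:client_ip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:method} /api/v%{NUMBER:version:float}/%{DATA:resource}/json\?%{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response_code:int} %{NUMBER:data_transfered:int}'
    }
  }

  # Split the query string into one field per parameter,
  # regardless of the order they appear in.
  kv {
    source => "request"
    field_split => "&"
  }

  # Whichever of query= / keyword= is present becomes search_query.
  if [query] {
    mutate { add_field => { "search_query" => "%{query}" } }
  } else if [keyword] {
    mutate { add_field => { "search_query" => "%{keyword}" } }
  }

  # Parse the Apache timestamp into @timestamp.
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
```

The location parameter extracted by kv is already a "lat,lon" string, which Elasticsearch accepts for a field mapped as geo_point, so no extra transformation is needed for the map in Kibana.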