Logstash: get URL params into a hash

Posted 2019-07-12 12:01

I'm trying to use Logstash and Elasticsearch to monitor my Apache web server activity. So far it works pretty well, but I need more specific information about the request field. My current Logstash configuration is:

filter {
  grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
  grok { match => { "request" => [ "url", "%{URIPATH:url_path}%{URIPARAM:url_params}?" ]} }
  urldecode { field => "url_path" }
  mutate { gsub => [ "url_params", "\?", "" ] }
  kv {
    field_split => "&"
    source => "url_params"
    prefix => "url_param_"
  }
  date { match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ] }
  geoip { source => "clientip" }
  useragent { source => "agent" }
}

Taking a basic Apache log line:

255.254.230.10 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345 HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"

The result of this first configuration is:

{
         "message" => "255.254.230.10 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345 HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
        "@version" => "1",
      "@timestamp" => "2013-12-11T08:01:45.000Z",
            ...
         "request" => "/xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345",
        "url_path" => "/xampp/boreal:123456/status.php",
      "url_params" => "pretty=true&test=boreal%3A12345",
"url_param_pretty" => "true",
  "url_param_test" => "boreal%3A12345",
           ...    
}

And (in a dream world), I would like to get this result for the URL params:

{
         ...
         "request" => "/xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345",
        "url_path" => "/xampp/boreal:123456/status.php",
      "url_params" => {
        "pretty" => "true",
          "test" => "boreal:12345"
      },
           ...    
}

My wishes

  • url_params becomes a hash.
  • Each key of this hash is the name of a param.
  • Each corresponding value is the URL-decoded value.

Questions

  • Do I need to create my own plugin (I'm not yet familiar with Ruby)?
  • Is there an existing plugin (I didn't find one, but maybe I searched badly)?
  • Is there a way to do this without a plugin?

Thanks for your help (and sorry for my English).

Renaud

Solution:

Thanks to Val, who found the solution. I changed my configuration to:

grok { match => { "request" => [ "url", "%{URIPATH:url_path}%{URIPARAM:url_params}?" ]} }
urldecode{ field => "url_path" }
mutate { gsub =>  ["url_params","\?","" ] }
kv {
  field_split => "&"
  source => "url_params"
  target => "url_params_hash"
}
urldecode{ field => "url_params_hash" }

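With this solution, the splitting stays correct even if a parameter value contains an encoded "&" (%26), because the kv filter splits on the literal "&" first and the values are URL-decoded only afterwards.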
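
To try these filters on a single log line before sending anything to Elasticsearch, a minimal test pipeline might look like the sketch below. It assumes the field names shown in the question (request, url_params, and so on); newer Logstash versions with ECS compatibility enabled may name the grok fields differently. It also uses a single grok pattern for the request, and the file name test.conf is hypothetical.

# test.conf (hypothetical file name): read log lines from stdin, print parsed events
input { stdin { } }

filter {
  # Parse the Apache combined log line into fields (request, clientip, agent, ...)
  grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
  # Split the request into the path and the optional query string
  grok { match => { "request" => "%{URIPATH:url_path}%{URIPARAM:url_params}?" } }
  urldecode { field => "url_path" }
  # Remove "?" characters (URIPARAM captures the leading one)
  mutate { gsub => [ "url_params", "\?", "" ] }
  # Split on the literal "&" while the values are still encoded
  kv {
    field_split => "&"
    source => "url_params"
    target => "url_params_hash"
  }
  # Decode the values only after splitting
  urldecode { field => "url_params_hash" }
}

# Print each event in the same readable format as the outputs shown in this thread
output { stdout { codec => rubydebug } }

Run it with bin/logstash -f test.conf, paste the sample Apache line on standard input, and check that url_params_hash contains one decoded entry per parameter.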

1 answer
男人必须洒脱
#2 · 2019-07-12 12:37

You're almost doing it right using the kv filter. You need to change its configuration a little bit.

You also need to add another urldecode filter for url_params, just after the one for the path:

urldecode{ field => "url_path" }
urldecode{ field => "url_params" }
mutate { gsub =>  ["url_params","\?","" ] }
kv {
  field_split => "&"
  source => "url_params"
  target => "url_params_hash"
}

You'll get something like this:

{
        "message" => "255.254.230.10 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/boreal%3A123456/status.php?pretty=true&test=boreal%3A12345 HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
       "@version" => "1",
     "@timestamp" => "2013-12-11T08:01:45.000Z",
"url_params_hash" => {
         "pretty" => "true",
           "test" => "boreal:12345"
     }
}