I'd like to parse ingress nginx logs using fluentd in Kubernetes. That was quite easy in Logstash, but I'm confused regarding fluentd syntax.
Right now I have the following rules:
<source>
  type tail
  path /var/log/containers/*.log
  pos_file /var/log/es-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag kubernetes.*
  format json
  read_from_head true
  keep_time_key true
</source>
<filter kubernetes.**>
  type kubernetes_metadata
</filter>
As a result, the nginx log line reaches Elasticsearch as a single unparsed string:
127.0.0.1 - [127.0.0.1] - user [27/Sep/2016:18:35:23 +0000] "POST /elasticsearch/_msearch?timeout=0&ignore_unavailable=true&preference=1475000747571 HTTP/2.0" 200 37593 "http://localhost/app/kibana" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Centos Chromium/52.0.2743.116 Chrome/52.0.2743.116 Safari/537.36" 951 0.408 10.64.92.20:5601 37377 0.407 200
I'd like to apply filter rules so that I can search by IP address, HTTP method, etc. in Kibana. How can I implement that?
Pipelines are quite different in Logstash and Fluentd, and it took some time to build a working Kubernetes -> Fluentd -> Elasticsearch -> Kibana solution.
The short answer to my question is to install the fluent-plugin-parser plugin (I wonder why it doesn't ship in the standard package) and put this rule after the kubernetes_metadata filter:
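The rule itself did not survive in this copy of the post. As a minimal sketch only (not necessarily the rule originally posted), a parser filter written against the sample line in the question might look like the following. The tag pattern and the field names are hypothetical, and the sketch assumes the raw nginx line sits in the log key, which is where Docker's json-file logging driver puts it.

```
# Hypothetical tag pattern: adjust it to the file name of your ingress controller's container log.
<filter kubernetes.var.log.containers.nginx-ingress-**>
  type parser
  # Field names below are illustrative; the regex follows the sample line from the question.
  format /^(?<remote_addr>[^ ]*) - \[(?<x_forwarded_for>[^\]]*)\] - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+) (?<path>[^ ]*) (?<protocol>[^"]*)" (?<code>\d+) (?<size>\d+) "(?<referer>[^"]*)" "(?<agent>[^"]*)" (?<request_length>\d+) (?<request_time>[\d.]+) (?<upstream_addr>[^ ]*) (?<upstream_response_length>\d+) (?<upstream_response_time>[\d.]+) (?<upstream_status>\d+)$/
  time_format %d/%b/%Y:%H:%M:%S %z
  # Parse the raw line stored under "log" and keep the existing keys (including Kubernetes metadata).
  key_name log
  reserve_data true
</filter>
```

If your ingress controller uses a different log_format, the regex has to change with it; lines that do not match are not parsed.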
The long answer, with lots of examples, is here: https://github.com/kayrus/elk-kubernetes/
That is because you use the json format for parsing: only the outer JSON record is decoded, and the nginx line inside it stays a plain string. Try this: http://docs.fluentd.org/articles/recipe-nginx-to-elasticsearch
If you use a custom log format, you might need to write your own regex: http://docs.fluentd.org/articles/in_tail
You can also use the multi-format-parser plugin: https://github.com/repeatedly/fluent-plugin-multi-format-parser
Note: I'm curious what the final conf looked like, including the parser filter.