I have a remote machine that combines multiline events and sends them across the lumberjack protocol.
What comes in is something that looks like this:
{
"message" => "2014-10-20T20:52:56.133+0000 host 2014-10-20 15:52:56,036 [ERROR ][app.logic ] Failed to turn message into JSON\nTraceback (most recent call last):\n File \"somefile.py\", line 249, in _get_values\n return r.json()\n File \"/path/to/env/lib/python3.4/site-packages/requests/models.py\", line 793, in json\n return json.loads(self.text, **kwargs)\n File \"/usr/local/lib/python3.4/json/__init__.py\", line 318, in loads\n return _default_decoder.decode(s)\n File \"/usr/local/lib/python3.4/json/decoder.py\", line 343, in decode\n obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n File \"/usr/local/lib/python3.4/json/decoder.py\", line 361, in raw_decode\n raise ValueError(errmsg(\"Expecting value\", s, err.value)) from None\nValueError: Expecting value: line 1 column 1 (char 0), Failed to turn message into JSON"
}
When I try to match the message with
grok {
match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:loglevel}%{SPACE}\]\[%{NOTSPACE:module}%{SPACE}\]%{GREEDYDATA:message}" ]
}
the GREEDYDATA is not nearly as greedy as I would like.
So then I tried to use gsub:
mutate {
gsub => ["message", "\n", "LINE_BREAK"]
}
# Grok goes here
mutate {
gsub => ["message", "LINE_BREAK", "\n"]
}
but that didn't work. Rather than
The Quick brown fox
jumps over the lazy
groks
I got
The Quick brown fox\njumps over the lazy\ngroks
So...
How do I either add the newlines back to my data, make GREEDYDATA match my newlines, or in some other way grab the relevant portion of my message?
Adding the (?m) regex flag to the beginning of the pattern lets . (and therefore GREEDYDATA) match newlines.
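A minimal sketch of what that might look like, assuming the flag meant here is (?m), which in the Ruby-style regex engine that grok uses lets . match newlines. The pattern is the one from the question with its braces written out in full. The overwrite line is an addition that is not in the question; it is there so the captured text replaces the existing message field instead of being added alongside it.

```
filter {
  grok {
    # (?m) at the start lets "." match "\n", so GREEDYDATA (.*) can span lines.
    match => [ "message", "(?m)%{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:loglevel}%{SPACE}\]\[%{NOTSPACE:module}%{SPACE}\]%{GREEDYDATA:message}" ]
    # Assumption (not in the question): overwrite the original message field.
    overwrite => [ "message" ]
  }
}
```

The flag applies to the whole pattern, so any other . in it will also match newlines.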
All GREEDYDATA is is .*, but . doesn't match newlines, so you can replace %{GREEDYDATA:message} with (?<message>(.|\r|\n)*) and get it to be truly greedy.
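As a sketch, the replacement described above would leave the grok block looking roughly like this. Everything before the final capture is the pattern from the question with its braces written out in full; only the last capture changes.

```
grok {
  # The named group matches any character, including \r and \n,
  # so it captures everything through the end of the multiline event.
  match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:loglevel}%{SPACE}\]\[%{NOTSPACE:module}%{SPACE}\](?<message>(.|\r|\n)*)" ]
}
```

Unlike a pattern-wide flag, this changes the behavior of only the one capture, which may be preferable if other parts of the pattern should stay on a single line.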