I am trying to log stack traces into Logstash.
The logging stack is ELK (ElasticSearch, Logstash, Kibana).
The application producing logs is a Java application, using slf4j
as a logging interface, and log4j2
as the logging implementation.
The log4j2.xml
declares this syslog
Appender, with the RFC5424
format:
<Appenders>
<Syslog name="RFC5424" format="RFC5424" host="localhost" port="8514"
protocol="TCP" appName="MyApp" includeMDC="true" mdcId="mdc"
facility="LOCAL0" enterpriseNumber="18060" newLine="true"
messageId="Audit" id="App">
<LoggerFields>
<KeyValuePair key="thread" value="%t"/>
<KeyValuePair key="priority" value="%p"/>
<KeyValuePair key="category" value="%c"/>
<KeyValuePair key="exception" value="%ex{full}"/>
</LoggerFields>
</Syslog>
</Appenders>
I log a Throwable from the Java application like so:
org.slf4j.LoggerFactory.getLogger("exception_test").error("Testing errors", new RuntimeException("Exception message"));
When an exception is logged, Logstash traces something like this to show me what it persists:
{
"@timestamp":"2016-11-08T11:08:10.387Z",
"port":60397,
"@version":"1",
"host":"127.0.0.1",
"message":"<131>1 2016-11-08T11:08:10.386Z MyComputer.local MyApp - Audit [mdc@18060 category=\"exception_test\" exception=\"java.lang.RuntimeException: Exception message",
"type":"syslog",
"tags":[
"_grokparsefailure"
]
}
And I confirm that Kibana displays exactly the same JSON within the _source
field of one of its log entries.
There's a problem here: no stack trace is saved. And the message, "Testing errors", is lost.
The "tags":["_grokparsefailure"]
is unfortunate but not related to this question.
I tried adding <ExceptionPattern/>
to see if it would change anything:
<Syslog name="RFC5424" format="RFC5424" host="localhost" port="8514"
protocol="TCP" appName="MyApp" includeMDC="true" mdcId="mdc"
facility="LOCAL0" enterpriseNumber="18060" newLine="true"
messageId="Audit" id="App">
<LoggerFields>
<KeyValuePair key="thread" value="%t"/>
<KeyValuePair key="priority" value="%p"/>
<KeyValuePair key="category" value="%c"/>
<KeyValuePair key="exception" value="%ex{full}"/>
</LoggerFields>
<ExceptionPattern>%ex{full}</ExceptionPattern>
</Syslog>
<ExceptionPattern/>
replaces the log message, and also (sadly) omits all loggerFields
. But it does give me a class name and line number:
{
"@timestamp":"2016-11-08T11:54:03.835Z",
"port":60397,
"@version":"1",
"host":"127.0.0.1",
"message":"at com.stackoverflow.LogTest.throw(LogTest.java:149)",
"type":"syslog",
"tags":[
"_grokparsefailure"
]
}
Again: no stack trace. And again: the message, "Testing errors", is lost.
How can I use log4j2
to log stack traces into Logstash? I don't necessarily have to use the syslog
appender.
Essentially the constraints are:
- Not be locked in to any particular logging infrastructure (this is why I used syslog)
- Multi-line stack traces need to be understood as being a single log entry. It's undesirable for "each line of the stack trace" to be "a separate log message"
- Stack traces must be able to be subjected to filters. A typical exception of mine can have a page-long stack trace. I want to filter out frames like Spring.
Log4j 2.5's SyslogAppender can only send stack traces over UDP.
<Syslog name="RFC5424" format="RFC5424" host="localhost" port="8514"
protocol="UDP" appName="MyApp" includeMDC="true" mdcId="mdc"
facility="LOCAL0" enterpriseNumber="18060" newLine="true"
messageId="LogTest" id="App">
<LoggerFields>
<KeyValuePair key="thread" value="%t"/>
<KeyValuePair key="priority" value="%p"/>
<KeyValuePair key="category" value="%c"/>
<KeyValuePair key="exception" value="%ex{full}"/>
</LoggerFields>
<ExceptionPattern>%ex{full}</ExceptionPattern>
</Syslog>
With UDP: both ExceptionPattern
and LoggerFields.KeyValuePair["exception"]
start working as solutions for multiline stack traces.
This is what logstash
prints when I sent an exception over UDP via syslog:
{
"@timestamp" => 2016-11-14T13:23:38.304Z,
"@version" => "1",
"host" => "127.0.0.1",
"message" => "<131>1 2016-11-14T13:23:38.302Z BirchBox.local MyApp - LogTest [mdc@18060 category=\"com.stackoverflow.Deeply\" exception=\"java.lang.RuntimeException: Exception message\n\tat com.stackoverflow.Deeply.complain(Deeply.java:10)\n\tat com.stackoverflow.Nested.complain(Nested.java:8)\n\tat com.stackoverflow.Main.main(Main.java:20)\n\" priority=\"ERROR\" thread=\"main\"] Example error\njava.lang.RuntimeException: Exception message\n\tat com.stackoverflow.Deeply.complain(Deeply.java:10)\n\tat com.stackoverflow.Nested.complain(Nested.java:8)\n\tat com.stackoverflow.Main.main(Main.java:20)",
"type" => "syslog",
"tags" => [
[0] "_grokparsefailure"
]
}
Inside [mdc@18060 exception=\"…\"]
we get the LoggerFields.KeyValuePair["exception"]
stack trace.
In addition to this: the stack trace is inserted into the logged message itself, thanks to ExceptionPattern
.
For reference: this is what logstash
prints when I send the exception over TCP via syslog (i.e. the same SyslogAppender as described above, but with protocol="TCP"
instead):
{
"@timestamp" => 2016-11-14T19:56:30.293Z,
"port" => 63179,
"@version" => "1",
"host" => "127.0.0.1",
"message" => "<131>1 2016-11-14T19:56:30.277Z BirchBox.local MyApp - Audit [mdc@18060 category=\"com.stackoverflow.Deeply\" exception=\"java.lang.RuntimeException: Exception message",
"type" => "syslog",
"tags" => [
[0] "_grokparsefailure"
]
}
{
"@timestamp" => 2016-11-14T19:56:30.296Z,
"port" => 63179,
"@version" => "1",
"host" => "127.0.0.1",
"message" => "at com.stackoverflow.Deeply.complain(Deeply.java:10)",
"type" => "syslog",
"tags" => [
[0] "_grokparsefailure"
]
}
{
"@timestamp" => 2016-11-14T19:56:30.296Z,
"port" => 63179,
"@version" => "1",
"host" => "127.0.0.1",
"message" => "at com.stackoverflow.Nested.complain(Nested.java:8)",
"type" => "syslog",
"tags" => [
[0] "_grokparsefailure"
]
}
{
"@timestamp" => 2016-11-14T19:56:30.296Z,
"port" => 63179,
"@version" => "1",
"host" => "127.0.0.1",
"message" => "at com.stackoverflow.Main.main(Main.java:20)",
"type" => "syslog",
"tags" => [
[0] "_grokparsefailure"
]
}
{
"@timestamp" => 2016-11-14T19:56:30.296Z,
"port" => 63179,
"@version" => "1",
"host" => "127.0.0.1",
"message" => "\" priority=\"ERROR\" thread=\"main\"] Example error",
"type" => "syslog",
"tags" => [
[0] "_grokparsefailure"
]
}
{
"@timestamp" => 2016-11-14T19:56:30.296Z,
"port" => 63179,
"@version" => "1",
"host" => "127.0.0.1",
"message" => "java.lang.RuntimeException: Exception message",
"type" => "syslog",
"tags" => [
[0] "_grokparsefailure"
]
}
{
"@timestamp" => 2016-11-14T19:56:30.297Z,
"port" => 63179,
"@version" => "1",
"host" => "127.0.0.1",
"message" => "at com.stackoverflow.Deeply.complain(Deeply.java:10)",
"type" => "syslog",
"tags" => [
[0] "_grokparsefailure"
]
}
{
"@timestamp" => 2016-11-14T19:56:30.298Z,
"port" => 63179,
"@version" => "1",
"host" => "127.0.0.1",
"message" => "at com.stackoverflow.Nested.complain(Nested.java:8)",
"type" => "syslog",
"tags" => [
[0] "_grokparsefailure"
]
}
{
"@timestamp" => 2016-11-14T19:56:30.298Z,
"port" => 63179,
"@version" => "1",
"host" => "127.0.0.1",
"message" => "at com.stackoverflow.Main.main(Main.java:20)",
"type" => "syslog",
"tags" => [
[0] "_grokparsefailure"
]
}
{
"@timestamp" => 2016-11-14T19:56:30.299Z,
"port" => 63179,
"@version" => "1",
"host" => "127.0.0.1",
"message" => "",
"type" => "syslog",
"tags" => [
[0] "_grokparsefailure"
]
}
It looks like TCP does actually "work", but splits the single log message into many syslog messages (for example when \n
is encountered).