Logstash date parsing as timestamp using the date

2019-02-02 09:24发布

问题:

Well, after looking around quite a lot, I could not find a solution to my problem, as it "should" work, but obviously doesn't. I'm using on a Ubuntu 14.04 LTS machine Logstash 1.4.2-1-2-2c0f5a1, and I am receiving messages such as the following one:

2014-08-05 10:21:13,618 [17] INFO  Class.Type - This is a log message from the class:
  BTW, I am also multiline

In the input configuration, I do have a multiline codec and the event is parsed correctly. I also separate the event text in several parts so that it is easier to read.

In the end, I obtain, as seen in Kibana, something like the following (JSON view):

{
  "_index": "logstash-2014.08.06",
  "_type": "customType",
  "_id": "PRtj-EiUTZK3HWAm5RiMwA",
  "_score": null,
  "_source": {
    "@timestamp": "2014-08-06T08:51:21.160Z",
    "@version": "1",
    "tags": [
      "multiline"
    ],
    "type": "utg-su",
    "host": "ubuntu-14",
    "path": "/mnt/folder/thisIsTheLogFile.log",
    "logTimestamp": "2014-08-05;10:21:13.618",
    "logThreadId": "17",
    "logLevel": "INFO",
    "logMessage": "Class.Type - This is a log message from the class:\r\n  BTW, I am also multiline\r"
  },
  "sort": [
    "21",
    1407315081160
  ]
}

You may have noticed that I put a ";" in the timestamp. The reason is that I want to be able to sort the logs using the timestamp string, and apparently logstash is not that good at that (e.g.: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/multi-fields.html).

I have unsuccessfull tried to use the date filter in multiple ways, and it apparently did not work.

date {
            locale => "en"
            match => ["logTimestamp", "YYYY-MM-dd;HH:mm:ss.SSS", "ISO8601"]
            timezone => "Europe/Vienna"
            target => "@timestamp"
            add_field => { "debug" => "timestampMatched"}
        }

Since I read that the Joda library may have problems if the string is not strictly ISO 8601-compliant (very picky and expects a T, see https://logstash.jira.com/browse/LOGSTASH-180), I also tried to use mutate to convert the string to something like 2014-08-05T10:21:13.618 and then use "YYYY-MM-dd'T'HH:mm:ss.SSS". That also did not work.

I do not want to have to manually put a +02:00 on the time because that would give problems with daylight saving.

In any of these cases, the event goes to elasticsearch, but date does apparently nothing, as @timestamp and logTimestamp are different and no debug field is added.

Any idea how I could make the logTime strings properly sortable? I focused on converting them to a proper timestamp, but any other solution would also be welcome.

As you can see below:

When sorting over @timestamp, elasticsearch can do it properly, but since this is not the "real" log timestamp, but rather when the logstash event was read, I need (obviously) to be able to sort also over logTimestamp. This is what then is output. Obviously not that useful:

Any help is welcome! Just let me know if I forgot some information that may be useful.

Update:

Here is the filter config file that finally worked:

# Filters messages like this:
# 2014-08-05 10:21:13,618 [17] INFO  Class.Type - This is a log message from the class:
#  BTW, I am also multiline

# Take only type- events (type-componentA, type-componentB, etc)
filter {
    # You cannot write an "if" outside of the filter!
    if "type-" in [type] {
        grok {
            # Parse timestamp data. We need the "(?m)" so that grok (Oniguruma internally) correctly parses multi-line events
            patterns_dir => "./patterns"
            match => [ "message", "(?m)%{TIMESTAMP_ISO8601:logTimestampString}[ ;]\[%{DATA:logThreadId}\][ ;]%{LOGLEVEL:logLevel}[ ;]*%{GREEDYDATA:logMessage}" ]
        }

        # The timestamp may have commas instead of dots. Convert so as to store everything in the same way
        mutate {
            gsub => [
                # replace all commas with dots
                "logTimestampString", ",", "."
                ]
        }

        mutate {
            gsub => [
                # make the logTimestamp sortable. With a space, it is not! This does not work that well, in the end
                # but somehow apparently makes things easier for the date filter
                "logTimestampString", " ", ";"
                ]
        }

        date {
            locale => "en"
            match => ["logTimestampString", "YYYY-MM-dd;HH:mm:ss.SSS"]
            timezone => "Europe/Vienna"
            target => "logTimestamp"
        }
    }
}

filter {
    if "type-" in [type] {
        # Remove already-parsed data
        mutate {
            remove_field => [ "message" ]
        }
    }
}

回答1:

I have tested your date filter. it works on me!

Here is my configuration

input {
    stdin{}
}

filter {
    date {
        locale => "en"
        match => ["message", "YYYY-MM-dd;HH:mm:ss.SSS"]
        timezone => "Europe/Vienna"
        target => "@timestamp"
        add_field => { "debug" => "timestampMatched"}
   }
}

output {
    stdout {
            codec => "rubydebug"
    }
}

And I use this input:

2014-08-01;11:00:22.123

The output is:

{
   "message" => "2014-08-01;11:00:22.123",
  "@version" => "1",
"@timestamp" => "2014-08-01T09:00:22.123Z",
      "host" => "ABCDE",
     "debug" => "timestampMatched"
}

So, please make sure that your logTimestamp has the correct value. It is probably other problem. Or can you provide your log event and logstash configuration for more discussion. Thank you.



回答2:

This worked for me - with a slightly different datetime format:

# 2017-11-22 13:00:01,621 INFO [AtlassianEvent::0-BAM::EVENTS:pool-2-thread-2] [BuildQueueManagerImpl] Sent ExecutableQueueUpdate: addToQueue, agents known to be affected: []
input {
   file {
       path => "/data/atlassian-bamboo.log"
       start_position => "beginning"
       type => "logs"      
       codec => multiline {
                pattern => "^%{TIMESTAMP_ISO8601} "
                charset => "ISO-8859-1"
                negate => true
                what => "previous"                
       }       
   }
}
filter {
   grok {
      match => [ "message", "(?m)^%{TIMESTAMP_ISO8601:logtime}%{SPACE}%{LOGLEVEL:loglevel}%{SPACE}\[%{DATA:thread_id}\]%{SPACE}\[%{WORD:classname}\]%{SPACE}%{GREEDYDATA:logmessage}" ]
   }

    date {
        match => ["logtime", "yyyy-MM-dd HH:mm:ss,SSS", "yyyy-MM-dd HH:mm:ss,SSS Z", "MMM dd, yyyy HH:mm:ss a" ]
        timezone => "Europe/Berlin"
   }   
}

output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}