Syntax for Lookahead and Lookbehind in Grok Custom Pattern

Posted 2019-08-29 03:39

Question:

I'm trying to use a lookbehind and a lookahead in a Grok custom pattern and getting pattern match errors in the Grok debugger that I cannot resolve.

This is for archiving system logs. I am currently trying to parse log lines from the postgrey application.

Given data such as:

2019-04-09T11:41:31-05:00 67.157.192.7 postgrey: action=pass, reason=triplet found, delay=388, client_name=unknown, client_address=103.255.78.9, sender=members@domain.com, recipient=person@domain.com

I'm trying to use the following to pull the string between "action=" and the comma immediately following it as the field "postgrey_action":

%{TIMESTAMP_ISO8601:timestamp} (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG} (?<postgrey_action>(?<=action=).+?(?=\,))

I expect to see the following output:

{
  "program": "postgrey:",
  "logsource": "67.157.192.7",
  "timestamp": "2019-04-09T11:41:31-05:00",
  "postgrey_action": "pass"
}

Instead, from the debugger, I receive "Provided Grok patterns do not match data in the input".

How can I properly make this lookbehind/lookahead work?

Edit: I should note that without the postgrey_action match at the end of the Grok pattern, the Grok Debugger runs and works as expected (using linux-syslog and grok-patterns).

Logstash version 6.3.2

Answer 1:

As a workaround, I have resorted to changing my syntax: I put the pattern in a custom patterns file and reference that file in each filter with the patterns_dir option.

For example, my pattern:

POSTGREY %{TIMESTAMP_ISO8601:timestamp} (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG} (action=)%{WORD:postgrey_action}(,) (reason=)%{DATA:postgrey_reason}(,) (delay=)%{NUMBER:postgrey_delay}(,) (client_name=)%{IPORHOST}(,) (client_address=)%{IPORHOST:postgrey_clientaddr}(,) (sender=)%{EMAILADDRESS:postgrey_sender}(,)

My filter:

    if "postgrey" in [program] {
        grok {
            match => { "message" => "%{POSTGREY}" }
            patterns_dir => ["/etc/logstash/patterns"]
            overwrite => [ "message" ]
        }
    }

However, this workaround still does not answer my original question: why did my initial approach not work?

Looking at the Oniguruma regex documentation and the Grok filter documentation, it is not clear to me what is wrong with my original syntax, or how a lookahead/lookbehind should be combined with a Grok named capture. If lookarounds are not supported, the documentation should not describe them as supported.
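
A likely explanation, offered as a sketch rather than a confirmed diagnosis: lookbehind and lookahead are zero-width assertions, so (?<=action=) only checks the text before the current position and does not consume it. In the original pattern the named group begins right after the space that follows %{SYSLOGPROG}, where the preceding text is "postgrey: " rather than "action=", so the assertion can never succeed there. The snippet below illustrates this with Python's re module, which is an assumption standing in for the Oniguruma engine that Grok uses. The \S+ prefix is a simplified, hypothetical stand-in for the TIMESTAMP_ISO8601, SYSLOGHOST and SYSLOGPROG patterns.

```python
import re

# Shortened copy of the sample log line from the question.
line = (
    "2019-04-09T11:41:31-05:00 67.157.192.7 postgrey: action=pass, "
    "reason=triplet found, delay=388, client_name=unknown"
)

# Hypothetical, simplified stand-in for the Grok prefix
# (timestamp, log source, program), each followed by one space.
prefix = r"(?P<timestamp>\S+) (?P<logsource>\S+) (?P<program>\S+) "

# Original approach: the lookbehind is zero-width, so "action=" is never
# consumed. The group starts right after "postgrey: ", where the text
# behind the cursor is not "action=", and the match fails.
original = prefix + r"(?P<postgrey_action>(?<=action=).+?(?=,))"

# Adjusted approach: consume the literal "action=" and capture what follows,
# stopping before the next comma.
fixed = prefix + r"action=(?P<postgrey_action>.+?)(?=,)"

original_match = re.match(original, line)
fixed_match = re.match(fixed, line)

# The lookarounds themselves are fine when the engine is free to move the
# start position, as in an unanchored search.
search_match = re.search(r"(?<=action=).+?(?=,)", line)

print(original_match)                        # None
print(fixed_match.group("postgrey_action"))  # pass
print(search_match.group(0))                 # pass
```

If this reading is right, the Grok equivalent would be to write the literal before the capture, for example action=(?<postgrey_action>[^,]+), which is essentially what the workaround above already does with (action=)%{WORD:postgrey_action}.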