
Can't parse XML input with Logstash filter

Posted 2020-07-20 00:24

Question:

Hi, I'm trying to parse the following XML:

<msg time='2014-08-04T14:36:02.136+03:00' org_id='oracle' comp_id='rdbms'
 msg_id='opistr_real:953:3971575317' type='NOTIFICATION' group='startup'
 level='16' host_id='linux4_l' host_addr='127.0.0.1'
 pid='8986' version='1'>
 <txt>Starting ORACLE instance (normal)
 </txt>
</msg>

using this configuration:

input {
  stdin {
    type => "stdin-type"
  }
}
filter {
  multiline {
    pattern => "^\s|</msg>|^[A-Za-z].*"
    what => "previous"
  }
  xml {
    store_xml => "false"
    source => "message"
    xpath => [
      "/msg/@client_id", "msg_client_id",
      "/msg/@host_id", "msg_host_id",
      "/msg/@host_addr", "msg_host_addr",
      "/msg/@level", "msg_level",
      "/msg/@module", "msg_module",
      "/msg/@msg_id", "msg_msg_id",
      "/msg/@pid", "msg_pid",
      "/msg/@org_id", "msg_org_id",
      "/msg/@time", "msg_time",
      "/msg/@level", "msg_level",
      "/msg/txt/text()", "msg_txt"
    ]
  }
  date {
    match => [ "msg_time", "ISO8601" ]
  }
  mutate {
    add_tag => "%{type}"
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}

but when I run Logstash I get the following error:

{:timestamp=>"2014-09-04T17:28:39.428000+0300", :message=>"Exception in filterworker", "exception"=>#<NoMethodError: undefined method `split' for ["msg_level", "msg_level"]:Array>, "backtrace"=>["/opt/logstash/lib/logstash/util/accessors.rb:19:in `parse'", "/opt/logstash/lib/logstash/util/accessors.rb:15:in `get'", "/opt/logstash/lib/logstash/util/accessors.rb:59:in `store_path'", "/opt/logstash/lib/logstash/util/accessors.rb:55:in `lookup'", "/opt/logstash/lib/logstash/util/accessors.rb:34:in `get'", "/opt/logstash/lib/logstash/event.rb:127:in `[]'", "/opt/logstash/lib/logstash/filters/xml.rb:117:in `filter'"

.... "/opt/logstash/lib/logstash/pipeline.rb:143:in `start_filters'"], :level=>:error} {:timestamp=>"2014-09-04T17:30:47.805000+0300", :message=>"Interrupt received. Shutting down the pipeline.", :level=>:warn}

Answer 1:

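I found my problem: I had duplicated an entry in the xpath list, so /msg/@level appears twice. That matches the error above, where the destination field is the array ["msg_level", "msg_level"] rather than a single field name.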
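
As a minimal sketch of that fix, the xml filter from the question with the second "/msg/@level" pair dropped might look like the block below. Every option and field name is copied from the question's configuration; nothing else is changed, and it has not been run against the asker's Logstash version.

```
xml {
  store_xml => "false"
  source => "message"
  # Each XPath expression now appears exactly once,
  # so each one maps to a single destination field name.
  xpath => [
    "/msg/@client_id", "msg_client_id",
    "/msg/@host_id", "msg_host_id",
    "/msg/@host_addr", "msg_host_addr",
    "/msg/@level", "msg_level",
    "/msg/@module", "msg_module",
    "/msg/@msg_id", "msg_msg_id",
    "/msg/@pid", "msg_pid",
    "/msg/@org_id", "msg_org_id",
    "/msg/@time", "msg_time",
    "/msg/txt/text()", "msg_txt"
  ]
}
```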



Answer 2:

The multiline codec is not well suited for this type of file, but you'd use something like:

multiline {
      pattern => '<msg'
      negate => true
      what => previous
}

It has the problem that the last event in the file doesn't go out until the next event comes in (so you end up losing the last event in a file).



Answer 3:

Answer 2 notes: "It has the problem that the last event in the file doesn't go out until the next event comes in (so you end up losing the last event in a file)."

It's better to match the closing tag instead.

multiline {
    pattern => "</msg>$"
    negate => true
    what => next
}
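
If the codec form mentioned in Answer 2 is preferred, the same closing-tag pattern could be attached to the input rather than placed in the filter section. This is only a sketch: it assumes the installed Logstash ships the multiline codec, and it reuses the stdin input from the question.

```
input {
  stdin {
    type => "stdin-type"
    # Join every line that does not end with </msg> to the line after it,
    # so one event is emitted per complete <msg>...</msg> element.
    codec => multiline {
      pattern => "</msg>$"
      negate => true
      what => "next"
    }
  }
}
```

With this in place, the multiline block would be removed from the filter section so that lines are not merged twice.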