How to make Logstash multiline filter merge lines

Published 2020-04-27 04:34

Question:

I am new to Logstash and desperate to set up ELK for one of my use cases. I found this question, which is relevant to mine: "Why won't Logstash multiline merge lines based on grok'd field?" If the multiline filter does not merge lines based on grok'd fields, then how do I merge lines 2 and 10 from the log sample below? Please help.

Using grok patterns I have created a field 'id' which holds the value 715.

Line1 - 5/08/06 00:10:35.348 [BaseAsyncApi] [qtp19303632-51]: INFO: [714] CMDC flowcxt=[55c2a5fbe4b0201c2be31e35] method=contentdetail uri=http://10.126.44.161:5600/cmdc/content/programid%3A%2F%2F317977349~programid%3A%2F%2F9?lang=eng&catalogueId=30&region=3000~3001&pset=pset_pps header={}   
Line2 - 2015/08/06 00:10:35.348 [BaseAsyncApi] [qtp19303632-53]: INFO: [715] CMDC flowcxt=[55c2a5fbe4b0201c2be31e36] method=contentdetail uri=http://10.126.44.161:5600/cmdc/content/programid%3A%2F%2F1640233758~programid%3A%2F%2F1073741829?lang=eng&catalogueId=30&region=3000~3001&pset=pset_pps header={}   
Line3 - 2015/08/06 00:10:35.349 [TWCAsyncProcessor] [TWC-pool-3-thread-2]: INFO: [714:426] TWC request=MercurySortRequest   
Line4 - 2015/08/06 00:10:35.349 [TWCAsyncProcessor] [TWC-pool-3-thread-1]: INFO: [715:427] TWC request=MercurySortRequest   
Line5 - 2015/08/06 00:10:35.352 [BaseAsyncApi] [qtp19303632-54]: INFO: [716] CMDC flowcxt=[55c2a5fbe4b0201c2be31e37] method=contentdetail uri=http://10.126.44.161:5600/cmdc/content/programid%3A%2F%2F2144942810~programid%3A%2F%2F1953281601?lang=eng&catalogueId=30&region=3000~3001&pset=pset_pps header={}   
Line6 - 2015/08/06 00:10:35.354 [TWCAsyncProcessor] [TWC-pool-3-thread-1]: INFO: [716:428] TWC request=MercurySortRequest   
Line7 - 2015/08/06 00:10:35.359 [BaseAsyncApi] [qtp19303632-49]: INFO: [717] CMDC flowcxt=[55c2a5fbe4b0201c2be31e38] method=contentdetail uri=http://10.126.44.161:5600/cmdc/content/programid%3A%2F%2F2144942448~programid%3A%2F%2F2147355770?lang=eng&catalogueId=30&region=3000~3001&pset=pset_pps header={}   
Line8 - 2015/08/06 00:10:35.360 [TWCAsyncProcessor] [TWC-pool-3-thread-2]: INFO: [717:429] TWC request=MercurySortRequest   
Line9 - 2015/08/06 00:10:35.366 [TWCAsyncProcessor$TWCAsyncProcessorCallback$ReceiveCallback] [CMDC-pool-2-thread-41]: INFO: [715:427] TWC response status=200 hits=1 time=17 internal=10.42   
Line10 - 2015/08/06 00:10:35.367 [BaseAsyncApi] [CMDC-pool-2-thread-41]: INFO: [715] CMDC response status=200 CMDC=19ms TWC=17ms #TWC=1

Answer 1:

You need to use a multiline filter with stream_identity set, so that lines sharing the same id are gathered into the same stream. The documentation here isn't clear on what it's used for, but your basic strategy would be something like this:

if "multiline" not in [tags] {
  grok {
    # parse out your identity field here (the "id" field in your case)
  }
  multiline {
    stream_identity => "%{id}"
    pattern => "."    # match anything, because we're gathering by the id field
    what => "previous"
    periodic_flush => true
    max_age => 5      # however many seconds it takes to get all of your lines together
    add_tag => ["multiline"]
  }
} else {
  # process the multiline event that has been flushed
}

I haven't tried anything like this since 1.5 came out, but the docs say it should work (in 1.4.2 and prior, the flushing mechanism didn't work, so you could lose events).
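
To make the intent of stream_identity concrete, here is a rough Python sketch of the grouping the config above is aiming for. It is not how Logstash implements it, just an illustration. The regex and the merge_by_id helper are hypothetical and not part of Logstash. I'm assuming the id is only taken from the plain "[715]" form, so lines tagged "[715:427]" are left out. The sample lines are abridged from the log in the question.

```python
import re

# Hypothetical pattern: captures the number from "INFO: [715]" only,
# so "INFO: [715:427]" does not match (an assumption about the grok rule).
ID_PATTERN = re.compile(r"INFO: \[(\d+)\]")


def merge_by_id(lines):
    """Group lines that share an id and join each group into one event."""
    groups = {}
    for line in lines:
        match = ID_PATTERN.search(line)
        if match is None:
            continue  # no plain [id] marker; a real pipeline would pass it through
        groups.setdefault(match.group(1), []).append(line)
    # Join each group's lines with newlines, like a merged multiline event.
    return {event_id: "\n".join(parts) for event_id, parts in groups.items()}


# Abridged versions of lines 2, 4, 5 and 10 from the question's log sample.
sample = [
    "2015/08/06 00:10:35.348 [BaseAsyncApi] [qtp19303632-53]: INFO: [715] CMDC method=contentdetail",
    "2015/08/06 00:10:35.349 [TWCAsyncProcessor] [TWC-pool-3-thread-1]: INFO: [715:427] TWC request=MercurySortRequest",
    "2015/08/06 00:10:35.352 [BaseAsyncApi] [qtp19303632-54]: INFO: [716] CMDC method=contentdetail",
    "2015/08/06 00:10:35.367 [BaseAsyncApi] [CMDC-pool-2-thread-41]: INFO: [715] CMDC response status=200 CMDC=19ms TWC=17ms #TWC=1",
]

merged = merge_by_id(sample)
```

Here merged["715"] holds the request and response lines for id 715 as one two-line event, which is the "merge line 2 and 10" result asked about. If your grok pattern also pulls 715 out of "[715:427]", those lines would land in the same stream too.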