Include monotonically increasing value in logstash

Posted 2019-02-27 16:28

Question:

I know there's no built-in "line count" functionality while processing files through Logstash (for various understandable and documented reasons). But there should be a mechanism, within any given Logstash instance, to have a monotonically increasing variable/count for every parsed line.

I don't want to go the metrics route, since it's a continuous polling mechanism (every n seconds). Alternatives include pre-processing the log files, which is unacceptable for my particular use case.

Again, let me reiterate: I need the ability to generate/read a monotonically increasing variable that I can store in a Logstash filter.

Thoughts?

Answer 1:

There's nothing built into Logstash to do it.

You can build a filter to do it pretty easily.

Just drop something like this into lib/logstash/filters/seq.rb:

# encoding: utf-8
require "logstash/filters/base"
require "logstash/namespace"
require "set"
#
# This filter adds a sequence number to a log entry
#
# The config looks like this:
#
#     filter {
#       seq {
#         field => "seq"
#       }
#     }
#
# The `field` is the field you want added to the event.
class LogStash::Filters::Seq < LogStash::Filters::Base

  config_name "seq"
  milestone 1

  config :field, :validate => :string, :required => false, :default => "seq"

  public
  def register
    # Nothing
  end # def register

  public
  def initialize(config = {})
    super

    @threadsafe = false

    # This filter needs to keep state.
    @seq=1
  end # def initialize

  public
  def filter(event)
    return unless filter?(event)
    event[@field] = @seq
    @seq = @seq + 1
    filter_matched(event)
  end # def filter
end # class LogStash::Filters::Seq

This will start at 1 every time Logstash is restarted, but for most situations this would be OK. If you need something that is persistent across restarts, you need to do a bit more work to persist it somewhere.
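One way to persist the counter is to keep its last value in a small state file. The following is only a minimal sketch, not part of the filter above: `PersistentSeq`, `next_value`, and the state-file path are hypothetical names invented for illustration.

```ruby
# Sketch: a counter that survives restarts by saving its last value to a file.
# PersistentSeq and next_value are hypothetical names, not part of Logstash.
class PersistentSeq
  def initialize(path)
    @path = path
    # Resume from the saved value, or start from 0 if no state file exists yet.
    @last = File.exist?(@path) ? File.read(@path).to_i : 0
  end

  # Return the next sequence number and record it on disk.
  def next_value
    @last += 1
    File.write(@path, @last.to_s)
    @last
  end
end
```

In the filter, this could be created once in `initialize` (for example with a path such as /var/lib/logstash/seq.state, which is an assumed location) and `filter` would then assign `@counter.next_value` to the field. Writing the file on every event is simple but slow, so a real filter would probably save the value only every so often or at shutdown.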



Answer 2:

For anyone finding this in 2018+: logstash now has a ruby filter that makes this much simpler. Put the following in a file somewhere:

# encoding: utf-8

def register(params)
    @seq = 1
end

def filter(event)
    event.set("seq", @seq)
    @seq += 1
    return [event]
end

And then configure it like this in your logstash.conf (substitute in the filename you used):

ruby {
  path => "/usr/local/lib/logstash/seq.rb"
}

It would be pretty easy to make the field name configurable from logstash.conf, but I'll leave that as an exercise for the reader.
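For what it's worth, here is one way that exercise might look. The ruby filter's `script_params` option passes a hash to `register`; the parameter name "field" and the example value "line_number" are just choices made for this sketch.

```ruby
# encoding: utf-8
# Sketch of seq.rb with a configurable field name.
#
# Assumed logstash.conf (the "field" key is a name chosen for this example):
#
#   ruby {
#     path => "/usr/local/lib/logstash/seq.rb"
#     script_params => { "field" => "line_number" }
#   }

def register(params)
  # params is the hash given as script_params; fall back to "seq" if unset.
  @field = params["field"] || "seq"
  @seq = 1
end

def filter(event)
  event.set(@field, @seq)
  @seq += 1
  [event]
end
```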

I suspect this isn't thread-safe, so I'm running only a single logstash worker.
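If more than one worker is needed, a Mutex around the increment is one possible approach. This is an untested-in-Logstash sketch that assumes the script's instance variables are shared by all worker threads, which I have not verified.

```ruby
# encoding: utf-8
# Sketch: guard the counter with a Mutex so that two threads cannot
# read and increment @seq at the same time.
# Assumption: all workers call filter on the same script instance.

def register(params)
  @seq = 0
  @lock = Mutex.new
end

def filter(event)
  # Take the next value inside the critical section, then set the field.
  n = @lock.synchronize { @seq += 1 }
  event.set("seq", n)
  [event]
end
```

Even if this keeps the values unique, with several workers the numbers reflect the order in which events reached the filter, which is not necessarily the order of the lines in the file.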



Answer 3:

This is another way to solve the problem. It works for me, thanks to the previous answer's point about thread safety. I use the seq field to sort in descending order.

This is my configuration:

logstash.conf

filter {
  ruby {
    # %s%N is epoch seconds followed by nanoseconds; %N alone restarts from 0 every second
    code => 'event.set("seq", Time.now.strftime("%s%N").to_i)'
  }
}

logstash.yml

pipeline.batch.size: 200
pipeline.batch.delay: 60
pipeline.workers: 1
pipeline.output.workers: 1
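A note on timestamp-derived values: in Ruby's `strftime`, `%N` is only the fractional part of the current second, so a value built from `%N` alone wraps around every second, while `%s%N` includes the epoch seconds. A small sketch comparing the two on fixed instants (the variable names are just for illustration):

```ruby
# Two instants 0.6 s apart that straddle a second boundary.
earlier = Time.at(1, 500_000)   # 1.5 s after the epoch
later   = Time.at(2, 100_000)   # 2.1 s after the epoch

# "%N" alone: fractional part only, so the later instant gets the smaller value.
frac_earlier = earlier.strftime("%N").to_i    # 500000000
frac_later   = later.strftime("%N").to_i      # 100000000

# "%s%N": epoch seconds followed by the nine-digit fraction, so order is preserved.
full_earlier = earlier.strftime("%s%N").to_i  # 1500000000
full_later   = later.strftime("%s%N").to_i    # 2100000000
```

Even the full timestamp can repeat for events handled in the same instant, or step backwards if the system clock is adjusted, so it is better treated as a sort key than as a strict counter.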


Tags: logstash