I am currently learning some Kafka best practices from Netflix (https://www.slideshare.net/wangxia5/netflix-kafka). It is a very good slide deck. However, I really don't understand one of the slides (slide 18) about producer resilience configuration, and I hope someone on Stack Overflow is kind enough to give me some insight (I can't find the video or reach the author ...).
The slide mentioned: Fail but never block in producer resilience configuration.
Block.on.buffer.full=false
Even though this is a deprecated configuration, I guess the idea is to let the producer fail right away rather than block and wait. In the latest Kafka configuration, I can set a small value for max.block.ms to make the producer fail to send the message rather than block.
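Something like this is what I have in mind (just a sketch; the broker address is a placeholder):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class FailFastProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // fail fast: block at most 50 ms (instead of the 60 s default) before send() gives up
        props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, "50");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // ... use the producer
        }
    }
}
```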
Question 1: Why do we want it to fail right away? Does that mean we retry later on rather than block?
Handle Potential Block for first meta data request
Question 2: I can understand metadata on the consumer side, i.e. registering the consumer group and that sort of thing, but what is a metadata request from the producer's point of view? And where can it potentially block? Is there any Kafka documentation that describes this?
Periodically check whether Kafka producer was open successfully
Question 3: Is there a way we can check that, and what is the benefit of that check?
Thanks in advance :)
You have to keep in mind how a Kafka producer works:

From the API documentation: when you call the send method to send a record to the broker, the message is added to an internal buffer (the size of this buffer can be configured using the buffer.memory configuration property). Now different things can happen: in the happy path, the background I/O thread picks the record up, sends it to the broker and frees the buffer space again; if the buffer is full, however, and max.block.ms (the replacement for block.on.buffer.full) is set to a positive value, the send call will block for this amount of time(1) and throw a TimeoutException afterwards.

Regarding your questions:

(1) If I got the slides right, Netflix explicitly wants to throw away the messages which they can't send to the broker (instead of blocking, retrying, failing ...). This of course highly depends on your application and the kind of messages you are dealing with. If it is "just log messages", it might be no big deal. If it comes to financial transactions, you may want to avoid losing messages.
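To illustrate the fail-and-drop approach, here is a minimal sketch (this is just one way to do it, not necessarily what Netflix does; the topic name is a placeholder and the producer is assumed to be configured with a small max.block.ms):

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.TimeoutException;

void sendOrDrop(KafkaProducer<String, String> producer, String message) {
    try {
        // send() may block up to max.block.ms waiting for buffer space or metadata
        producer.send(new ProducerRecord<>("my-topic", message), (metadata, exception) -> {
            if (exception != null) {
                // asynchronous failure (e.g. broker down): log and drop instead of retrying
                System.err.println("Dropped message: " + exception.getMessage());
            }
        });
    } catch (TimeoutException e) {
        // buffer full or no metadata within max.block.ms: fail fast and drop the message
        System.err.println("Dropped message (producer blocked too long): " + e.getMessage());
    }
}
```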
(2) The producer needs some metadata about the cluster, e.g. it needs to know which key goes to which partition. There is a good blog post by Hortonworks about how the producer works internally. I think it is worth reading: https://community.hortonworks.com/articles/72429/how-kafka-producer-work-internally.html
Furthermore, the documentation of max.block.ms states that send() and partitionsFor() can block "either because the buffer is full or metadata unavailable", so the very first send() to a topic can indeed block while the producer fetches the initial metadata.
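One way to handle that potential block for the first metadata request is to warm up the metadata right after creating the producer, before any real traffic arrives. A minimal sketch (the topic name is a placeholder):

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.errors.TimeoutException;

// force the initial metadata fetch at startup instead of on the first send()
boolean warmUpMetadata(KafkaProducer<String, String> producer, String topic) {
    try {
        // partitionsFor() blocks (up to max.block.ms) until metadata for the topic is available
        producer.partitionsFor(topic);
        return true;
    } catch (TimeoutException e) {
        // no metadata within max.block.ms: cluster unreachable or topic missing
        return false;
    }
}
```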
(3) Connections to the producers are closed by the broker if the producer is idle for some time (see connections.max.idle.ms). I am not aware of a standard way to keep the connection of your producer alive or even to check whether the connection is still alive. What you could do is periodically send a metadata request to the broker (producer.partitionsFor(anyTopic)); see the sketch at the end of this answer. But again, maybe this is not an issue for your application.

(1) When it comes to the details of what is taken into account when calculating the time passed, it gets a bit tricky. For max.block.ms it is actually the time spent fetching metadata plus the time spent waiting for free buffer space; blocking in user-supplied serializers or the partitioner is not counted against this timeout.
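For illustration, here is one way the periodic check from (3) could look (just a sketch; the interval is arbitrary and anyTopic is a placeholder):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.clients.producer.KafkaProducer;

// periodically fire a metadata request so that problems with the producer surface early
void scheduleProducerCheck(KafkaProducer<String, String> producer, String anyTopic) {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleAtFixedRate(() -> {
        try {
            producer.partitionsFor(anyTopic); // blocks at most max.block.ms
        } catch (Exception e) {
            // broker not reachable within max.block.ms: alert, recreate the producer, ...
            System.err.println("Producer check failed: " + e.getMessage());
        }
    }, 1, 1, TimeUnit.MINUTES);
}
```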