Kafka offset management

2019-07-11 19:41发布

We are using Kafka 0.10... I'm seeing some conflicting information online (and in documentation) regarding how offsets are managed in kafka when enable.auto.commit is TRUE. Does the same poll() method that retrieves messages also handle the commits at the configured intervals?

If i retrieve messages from poll in a single threaded application, process the messages to completion (including handling errors) in the SAME thread, meaning poll() will not be invoked again until after my processing is complete, then I presume there is no fear in losing messages, correct? This only works if poll() attempts the commit at the subsequent invocation (if the auto.commit.interval.ms has passed, of course). If the commits are done immediately upon receiving the messages (prior to my app processing the messages), this will not work for us....

This is important, as I want to be certain we won't lose messages if we use the automatic commit policy. Duplicate messages are tolerable for us, we just have no tolerance for lost data.

Thanks for the clarification!

1条回答
干净又极端
2楼-- · 2019-07-11 19:55

Does the same poll() method that retrieves messages also handle the commits at the configured intervals?

Yes. (If enable.auto.commit=true.)

If i retrieve messages from poll in a single threaded application, process the messages to completion (including handling errors) in the SAME thread, meaning poll() will not be invoked again until after my processing is complete, then I presume there is no fear in losing messages, correct?

Yes.

This only works if poll() attempts the commit at the subsequent invocation (if the auto.commit.interval.ms has passed, of course)

This is exactly how it is done.

See here for further details: http://docs.confluent.io/current/clients/consumer.html

查看更多
登录 后发表回答