How to clean old segments from a compacted log in Kafka

Posted 2019-08-13 07:32

I know that newer Kafka versions offer a new retention policy option: log compaction, which deletes old versions of messages that share the same key. But over a long period we will accumulate too many compacted log segments containing old messages. How can we clean up this compacted log automatically?

UPDATE:

I should clarify that we need a compacted log and also a way to clean up old messages in it over time. I found a discussion of the same problem here http://grokbase.com/t/kafka/users/14bv6gaz0t/kafka-0-8-2-log-cleaner but could not find out how to manually issue tombstone markers for messages, and I have no idea how to do this.
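For readers unfamiliar with tombstones: in Kafka a tombstone is simply a record with an existing key and a null value (with the Java client, something like `new ProducerRecord<>(topic, key, null)`). The sketch below is only a toy, in-memory model of what compaction does with such records, assuming a hypothetical `Rec` type and `compact` helper; it is not Kafka's actual log cleaner.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of log compaction (hypothetical names, not Kafka's cleaner code).
class CompactionSketch {
    // A record whose value is null plays the role of a tombstone.
    static final class Rec {
        final String key;
        final String value;

        Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Keeps the latest value per key and drops keys whose latest record is a tombstone.
    // A real broker keeps the tombstone itself around for delete.retention.ms first.
    static Map<String, String> compact(List<Rec> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (Rec r : log) {
            if (r.value == null) {
                latest.remove(r.key);       // tombstone: the key disappears
            } else {
                latest.put(r.key, r.value); // newer value replaces the older one
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        List<Rec> log = new ArrayList<>();
        log.add(new Rec("user-1", "v1"));
        log.add(new Rec("user-2", "v1"));
        log.add(new Rec("user-1", "v2"));
        log.add(new Rec("user-2", null)); // tombstone for user-2
        System.out.println(compact(log)); // prints {user-1=v2}
    }
}
```

So "manually issuing a tombstone" for an old key amounts to producing one more record for that key with a null value and letting the cleaner do the rest.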

2 answers
迷人小祖宗
#2 · 2019-08-13 07:52

The only other way to lower the size of your Kafka Log would be through the log retention configuration settings.

log.retention.{ms,minutes,hours}
log.retention.bytes

Also note that if log.retention.hours and log.retention.bytes are both set, a segment is deleted when either limit is exceeded.

These two settings dictate when log segments are deleted in Kafka. log.retention.bytes defaults to -1, and I'm pretty sure that leaving it at -1 means the time setting alone determines when a log gets deleted.
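To make the "either limit" rule concrete, here is a minimal sketch of it under simplifying assumptions. The `Segment` type and `applyRetention` method are hypothetical names for illustration, and the real broker differs in details (for example, it never deletes the active segment).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified model of time/size retention (hypothetical names, not broker code).
class RetentionSketch {
    static final class Segment {
        final long sizeBytes;
        final long lastModifiedMs;

        Segment(long sizeBytes, long lastModifiedMs) {
            this.sizeBytes = sizeBytes;
            this.lastModifiedMs = lastModifiedMs;
        }
    }

    // Deletes segments oldest-first while either limit is exceeded.
    // A limit of -1 disables that check. Returns the number of deleted segments.
    static int applyRetention(Deque<Segment> segments, long retentionMs,
                              long retentionBytes, long nowMs) {
        long total = 0;
        for (Segment s : segments) {
            total += s.sizeBytes;
        }
        int deleted = 0;
        while (!segments.isEmpty()) {
            Segment oldest = segments.peekFirst();
            boolean tooOld = retentionMs >= 0 && nowMs - oldest.lastModifiedMs > retentionMs;
            boolean tooBig = retentionBytes >= 0 && total > retentionBytes;
            if (!tooOld && !tooBig) {
                break; // neither limit is exceeded any more
            }
            segments.pollFirst();
            total -= oldest.sizeBytes;
            deleted++;
        }
        return deleted;
    }

    public static void main(String[] args) {
        Deque<Segment> segments = new ArrayDeque<>();
        segments.add(new Segment(100, 0));
        segments.add(new Segment(100, 1000));
        segments.add(new Segment(100, 2000));
        // Size limit disabled (-1), so only the age check applies.
        System.out.println(applyRetention(segments, 2000, -1, 2500));
    }
}
```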

Log retention and compaction work separately from each other. With retention, logs can be deleted after a certain time or size even when log compaction is on. So if you have, say, a 100 MB log limit and set log.retention.bytes=104857600 (100 MB), Kafka will compact your log until it reaches 100 MB in size, and will then delete as many messages as necessary (oldest first) until the log is under 100 MB.

EDIT:

It turns out that log retention and compaction are mutually exclusive, according to the link provided by mechanikos. It still seems odd that Kafka is designed so that a log can grow indefinitely with no way of ever deleting old messages.

做个烂人
#3 · 2019-08-13 07:56

This question is quite old, but I thought I'd give the latest update on the matter. There is a feature (https://issues.apache.org/jira/browse/KAFKA-4015) which is already resolved and is scheduled for the 0.10.1.0 release.
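As far as I understand it, that ticket lets `cleanup.policy` accept a list, so a topic can be both compacted and subject to time or size based deletion. A minimal sketch of the topic-level settings this would involve is below. The class name, the `topicConfig` helper, and the seven-day value are assumptions for illustration, and how you apply the settings (command-line tools or the admin API) is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Builds the topic-level settings as plain strings (hypothetical helper, not a Kafka API).
class CompactAndDeleteConfig {
    static Map<String, String> topicConfig(long retentionMs) {
        Map<String, String> cfg = new LinkedHashMap<>();
        // Both policies at once: compact by key AND delete old segments.
        cfg.put("cleanup.policy", "compact,delete");
        // Segments older than this become eligible for deletion.
        cfg.put("retention.ms", Long.toString(retentionMs));
        return cfg;
    }

    public static void main(String[] args) {
        long sevenDaysMs = 7L * 24 * 60 * 60 * 1000; // assumed value for illustration
        for (Map.Entry<String, String> e : topicConfig(sevenDaysMs).entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```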
