Kafka retention policy didn't work as expected

Posted 2019-07-27 12:38

Question:

I wanted a certain Kafka topic to keep only 1 day of data. But it didn't seem to delete any data at all as long as we kept sending data to the topic (i.e. while it was active). I tried the topic-level parameter (retention.ms) and these server-side settings:

    log.retention.hours=1 or log.retention.ms=86400000
    cleanup.policy=delete

But it didn't seem to work for active topics that we keep sending data to. Only when we stop sending data to the topic does it follow the retention policy.

So what's the right config for an active topic, to retain data only for a limited time?

Answer 1:

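Log retention is based on the creation date of the log file, so data can only be removed one whole log file at a time, and a file that is still being written to stays. Try setting log.roll.hours to less than 24 (the default is 24 * 7 = 168).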

For 0.8:

If you only want to control log file creation per topic, set log.roll.hours.per.topic in the topic config.

For 1.0:

Logs are split into segments, and the per-topic config for rolling log segments is:

segment.ms

Note: this is in milliseconds, and it overrides the server-wide setting log.roll.ms.
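
Since both retention.ms and segment.ms take milliseconds, it is easy to be off by a factor of 1000 or 60. A small sketch that derives the two values from readable durations may help; the helper names (to_ms, topic_overrides) and the 1 hour roll interval are assumptions made for illustration, not part of Kafka:

```python
from datetime import timedelta


def to_ms(delta: timedelta) -> int:
    """Convert a duration to whole milliseconds."""
    return int(delta.total_seconds() * 1000)


def topic_overrides(retention: timedelta, roll: timedelta) -> str:
    """Build a comma-separated list of topic-level overrides.

    retention.ms: how long closed segments are kept.
    segment.ms:   how often the active segment is rolled (closed).
    """
    return f"retention.ms={to_ms(retention)},segment.ms={to_ms(roll)}"


# Keep 1 day of data, roll the active segment every hour (illustrative values).
print(topic_overrides(timedelta(days=1), timedelta(hours=1)))
```

The resulting string has the key=value,key=value shape that kafka-configs.sh accepts for --add-config when altering a topic. Check the tool's help output for the exact flags in your Kafka version.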

See also: Purge Kafka Topic



Answer 2:

Kafka deletes only passive (closed) log segments, never the active one. You have to tune either log.segment.bytes or log.roll.ms so that the active log segment is rolled into a passive one sooner. Refer to the broker configuration documentation for more information.
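
To see why an active topic appears to ignore retention, here is a toy model of the behaviour described above. It is not Kafka code: the Segment and TopicLog classes, the one-write-per-hour loop, and the time-only roll rule are simplifying assumptions. Real brokers also roll on size and run the retention check periodically.

```python
from dataclasses import dataclass, field
from typing import List

HOUR_MS = 60 * 60 * 1000


@dataclass
class Segment:
    created_ms: int
    last_write_ms: int


@dataclass
class TopicLog:
    segment_ms: int    # roll interval, plays the role of segment.ms / log.roll.ms
    retention_ms: int  # plays the role of retention.ms / log.retention.ms
    segments: List[Segment] = field(default_factory=list)

    def append(self, now_ms: int) -> None:
        # Roll a new active segment once the current one is old enough.
        if not self.segments or now_ms - self.segments[-1].created_ms >= self.segment_ms:
            self.segments.append(Segment(now_ms, now_ms))
        else:
            self.segments[-1].last_write_ms = now_ms

    def apply_retention(self, now_ms: int) -> None:
        # Only closed segments are candidates; the last (active) one always stays.
        closed, active = self.segments[:-1], self.segments[-1:]
        kept = [s for s in closed if now_ms - s.last_write_ms <= self.retention_ms]
        self.segments = kept + active


def oldest_data_age_hours(segment_ms: int, retention_ms: int, hours: int) -> int:
    """Write once per hour for `hours` hours, then report the age of the oldest segment."""
    log = TopicLog(segment_ms, retention_ms)
    for h in range(hours + 1):
        now = h * HOUR_MS
        log.append(now)
        log.apply_retention(now)
    return (hours * HOUR_MS - log.segments[0].created_ms) // HOUR_MS


# Default-like 7 day roll with 1 day retention: nothing is deleted after 100 hours.
print(oldest_data_age_hours(168 * HOUR_MS, 24 * HOUR_MS, 100))
# Hourly roll with 1 day retention: old segments close and become deletable.
print(oldest_data_age_hours(1 * HOUR_MS, 24 * HOUR_MS, 100))
```

In this model the first call reports data that is 100 hours old, because the single segment never closes, while the second reports 24 hours. That matches the symptom in the question: retention only takes effect once segments are rolled often enough.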