I have a Kafka cluster that receives messages from a source whenever data changes in that source. In some cases the messages are meant to be processed in the future, so I have two options:
- Consume all messages and post the ones meant for the future back to Kafka under a different topic (with the date in the topic name), and have a Storm topology that looks for topics with that date in the name. This ensures that messages are processed only on the day they are meant for (sketched below).
- Store them in a separate DB and build a scheduler that reads the messages and posts them to Kafka only on that future date (also sketched below).
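
To make option 1 concrete, here is roughly what I have in mind for re-posting a deferred message onto a date-named topic. This is just a sketch; the `deferred-` topic prefix, the payload, and the broker address are placeholders:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class FutureMessageRouter {

    private static final DateTimeFormatter TOPIC_DATE = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // A message consumed from the main topic that should only be handled on 2024-06-01.
            LocalDate processOn = LocalDate.of(2024, 6, 1);
            String payload = "{\"orderId\": 42, \"action\": \"expire\"}";

            // Re-post it to a date-named topic, e.g. "deferred-2024-06-01".
            // A Storm topology (or plain consumer) started on that day would subscribe
            // to "deferred-" + today's date and process everything in that topic.
            String topic = "deferred-" + processOn.format(TOPIC_DATE);
            producer.send(new ProducerRecord<>(topic, payload));
        }
    }
}
```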
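
And option 2 would be something like the following, with an in-memory map standing in for the separate DB and a daily job that re-publishes whatever has become due. Again just a sketch; the `live-events` topic name and the once-a-day schedule are illustrative:

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DeferredMessageScheduler {

    // Stand-in for the separate DB: due date -> payloads. In practice this would be
    // a table queried with something like "WHERE due_date <= today AND NOT sent".
    private static final Map<LocalDate, List<String>> store = new ConcurrentHashMap<>();

    public static void main(String[] args) {
        store.put(LocalDate.now(), List.of("{\"orderId\": 42, \"action\": \"expire\"}"));

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<>(props);

        // Once a day, publish everything that has become due back onto the live topic.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            List<String> due = store.remove(LocalDate.now());
            if (due != null) {
                due.forEach(payload -> producer.send(new ProducerRecord<>("live-events", payload)));
            }
        }, 0, 1, TimeUnit.DAYS);
    }
}
```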
Option 1 is easier to execute, but my questions are: Is Kafka a durable data store? Has anyone done this sort of eventing with Kafka? Are there any gaping holes in the design?