I am looking for a way to collect metrics data from multiple devices. The data should be aggregated by multiple "group by"-like functions. The list of aggregation functions is not final; new aggregations will be added later, and they will need to be applied to all data collected from the first day.
Is it fine to create a Kafka topic with a 100-year retention period and use it as a datastore for this purpose? That way, new aggregations would be able to read from the start of the topic, while existing aggregations would continue from their committed offsets.
Yes, if you want to keep the data, you can just increase the retention time to a large value.
I'd still recommend having a size-based retention policy as well, to ensure you don't run out of disk space.
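As a minimal sketch of what that could look like with Kafka's `AdminClient`: `retention.ms=-1` disables time-based deletion entirely, and `retention.bytes` caps how much each partition may hold. The topic name, partition count, replication factor, broker address, and the 100 GB cap below are just placeholder assumptions, not recommendations.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class CreateMetricsTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker

        try (AdminClient admin = AdminClient.create(props)) {
            NewTopic topic = new NewTopic("device-metrics", 12, (short) 3) // hypothetical name/partitions/RF
                    .configs(Map.of(
                            // -1 disables time-based deletion, i.e. keep records indefinitely
                            "retention.ms", "-1",
                            // per-partition size cap (~100 GB here) so the disk cannot fill up silently
                            "retention.bytes", String.valueOf(100L * 1024 * 1024 * 1024)
                    ));
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```

Note that `retention.bytes` applies per partition, so the total disk budget is roughly that value times the partition count (times the replication factor across the cluster).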
In principle, yes, you can use Kafka for long-term storage, exactly for the reason you outline: reprocessing the source data to derive additional aggregates/calculations.
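A sketch of how a newly added aggregation could reprocess the topic from the beginning: give it a consumer group id that has never committed offsets and set `auto.offset.reset=earliest`, so it starts at the earliest available offset while existing groups keep consuming from their own committed positions. The topic name, group id, broker address, and `String` deserializers are assumptions for illustration only.

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class NewAggregationConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        // A fresh group id that has no committed offsets yet
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "new-aggregation-v1");
        // With no committed offsets, "earliest" makes the group start from the beginning of the topic
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("device-metrics"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // feed each record into the new aggregation here
                }
            }
        }
    }
}
```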
A couple of references: