I tried looking at cache mechanisms, such as Guava's Cache
. Their expiration is only since last update.
What I'm looking for is a data structure that stores keys and cleans the keys after a time has passed since the first insertion. I'm planning for the value to be some counter.
A scenario might be a silent worker that does some work for the first time but keeps silent for an expiry period of time - even if work is requested. If work is requested after the expiry time has passed, he'll do the work.
Know of such a data structure? Thanks.
Created a data structure. Called it
DuplicateActionFilterByInsertTime
.The correct notion is filtering duplicate messages. The following class filters from insert time for some period (
filterMillis
).Implementation:
Test example: (More tests here)
Source.
There are a few options for this.
Passive Removal
If it is not a requirement to clean up the expired keys as soon as they expire or at set intervals (i.e. a key does not need to be removed when the key expires or at some set interval), then PassiveExpiringMap from Apache Commons Collections is a good option. When attempting to access a key in this map, the Time to Live (TTL) of the key (all keys have the same TTL) is checked and if the key has expired, it is removed from the map and
null
is returned. This data structure does not have an active clean-up mechanism, so expired entries are only removed when they are accessed after the TTL corresponding to the key has expired.Cache
If more cache-based functionality is needed (such as maximum cache capacity and add/remove listening), Google Guava provides the CacheBuilder class. This class is more complex than the Apache Commons alternative, but it also provides much more functionality. The trade-off may be worth it if this intended for more of a cache-based application.
Threaded Removal
If active removal of expired keys is needed, a separate thread can be spawn that is responsible for removing expired keys. Before looking at a possible implementation structure, it should be noted that this approach may be less performant than the above alternatives. Besides the overhead of starting a thread, the thread may cause contention with clients accessing the map. For example, if a client wants to access a key and the clean-up thread is currently removing expired keys, the client will either block (if synchronization is used) or have a different view of the map (which key-value pairs are contained in the map) if some concurrent mechanism is employed.
With that said, using this approach is complicated because it requires that the TTL be stored with the key. One approach is to create an
ExpiringKey
, such as (each key can have its own TTL; even if all of the keys will end up having the same TTL value, this technique removes the need to create aMap
decorator or some other implementation of theMap
interface):Now the type of the map would be
Map<ExpiringKey<K>, V>
with some specificK
andV
type values. The background thread can be represented using aRunnable
that resembles the following:Then the
Runnable
can be started so that it executes at a fixed interval using aScheduledExecutorService
as follows (which will clean up the map every 5 seconds):It is important to note that the
Map
implementation used formyMap
must be synchronized or allow for concurrent access. The challenge with a concurrentMap
implementation is that theExpiredKeyRemover
may see a different view of the map than a client and an expired key may be returned to the client if the clean-up thread is not completed removing other keys (even if it has removed the desired/expired key since its changes may not have been committed yet). Additionally, the above key-removal code can be implemented using streams, but the above code has been used just to illustrate the logic rather than provide a performant implementation.Hope that helps.