I've got a Guava Cache (or rather, I am migrating from MapMaker to Cache) and the values represent long-running jobs. I'd like to add expireAfterAccess behavior to the cache, as it's the best way to clean it up; however, the job may still be running even though it hasn't been accessed via the cache in some time, and in that case I need to prevent it from being removed from the cache. I have three questions:
1. Is it safe to reinsert the cache entry that's being removed during the RemovalListener callback?
2. If so, is it thread-safe, such that there's no possible way the CacheLoader could produce a second value for that key while the RemovalListener callback is still happening in another thread?
3. Is there a better way to achieve what I want? This isn't strictly/only a "cache" - it's critical that one and only one value is used for each key - but I also want to cache the entry for some time after the job it represents is complete. I was using MapMaker before, and the behaviors I need are now deprecated in that class. Regularly pinging the map while the jobs are running is inelegant, and in my case infeasible. Perhaps the right solution is to have two maps, one without eviction and one with, and migrate entries across as they complete.
I'll make a feature request too - this would solve the problem: allow individual entries to be locked to prevent eviction (and then subsequently unlocked).
[Edit to add some details]: The keys in this map refer to data files. The values are either a running write job, a completed write job, or - if no job is running - a read-only, produced-on-lookup object with information read from the file. It's important that there is exactly zero or one entry for each file. I could use separate maps for the two things, but there would have to be coordination on a per-key basis to make sure only one or the other is in existence at one time. Using a single map makes it simpler, in terms of getting the concurrency correct.
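The "exactly zero or one entry per file" requirement can be sketched with plain JDK primitives (this is a hypothetical JobMap class, not the poster's code; ConcurrentHashMap.computeIfAbsent gives the same run-the-factory-at-most-once-per-key guarantee that a Guava computing cache provides):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one-and-only-one value per key, as the question requires.
// computeIfAbsent runs the factory at most once per absent key, so two threads
// asking for the same file can never end up with two different values.
public class JobMap {
    private final ConcurrentMap<String, String> jobs = new ConcurrentHashMap<>();
    private final AtomicInteger factoryCalls = new AtomicInteger();

    public String getOrCreate(String file) {
        return jobs.computeIfAbsent(file, f -> {
            factoryCalls.incrementAndGet();
            // Stand-in for "a running write job or a read-only view of the file".
            return "job-for-" + f;
        });
    }

    public int factoryCalls() {
        return factoryCalls.get();
    }

    public static void main(String[] args) {
        JobMap m = new JobMap();
        String a = m.getOrCreate("data.bin");
        String b = m.getOrCreate("data.bin"); // same key: factory must not run again
        if (a != b) throw new AssertionError("expected the same instance per key");
        if (m.factoryCalls() != 1) throw new AssertionError("factory ran more than once");
        System.out.println("ok");
    }
}
```

The open problem in the question is then only the eviction side: how to expire such entries without ever breaking the one-value-per-key invariant.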
To address the first two questions in your post directly: removal listeners are triggered at some arbitrary point in time after elements are removed from the cache, so you cannot use a removal listener to atomically re-insert a value.
1. Yes, this is safe; calling put() from within a RemovalListener will not break the cache in any way. Broadly speaking, Cache has no problem with concurrent mutations.
2. No, mutations triggered by a RemovalListener are not atomic with respect to the associated removal. Not only is it possible for another thread to access the cache between the removal finishing and the listener being notified, there may be a significant delay between those two actions. As the wiki mentions, "removal listener operations are executed synchronously [during] cache maintenance". It goes on to say that maintenance tasks are executed in small batches, and no guarantees are provided as to when (or, implicitly, if) they will actually be carried out.
3. Like you say, this isn't really a cache (or, more specifically, an evicting cache). If there are objects you don't want to be evicted, simply don't store them in an evicting cache. As Louis suggests, a two-data-structure setup - a map for ongoing processes and an evicting cache for finished processes - would do exactly what you need.
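That two-structure idea can be sketched as follows (hypothetical names throughout; a plain map with a manual time-based sweep stands in for a real Guava expireAfterAccess cache, so the sketch stays dependency-free):

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the two-data-structure setup: ongoing jobs live in a
// plain map that is never evicted; completed jobs move to an "evicting" store.
// The eviction here is a manual TTL sweep standing in for expireAfterAccess.
public class TwoStores {
    private final Map<String, String> running = new ConcurrentHashMap<>();
    private final Map<String, String> finished = new ConcurrentHashMap<>();
    private final Map<String, Long> finishedAt = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public TwoStores(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public void start(String key, String job) {
        running.put(key, job);
    }

    // Move a completed job from the non-evicting store to the evicting one.
    public void complete(String key) {
        String job = running.remove(key);
        if (job != null) {
            finished.put(key, job);
            finishedAt.put(key, System.currentTimeMillis());
        }
    }

    // Stand-in for cache maintenance: drop finished entries older than the TTL.
    public void sweep(long now) {
        Iterator<Map.Entry<String, Long>> it = finishedAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (now - e.getValue() >= ttlMillis) {
                finished.remove(e.getKey());
                it.remove();
            }
        }
    }

    public boolean contains(String key) {
        return running.containsKey(key) || finished.containsKey(key);
    }

    public static void main(String[] args) {
        TwoStores s = new TwoStores(100);
        s.start("f1", "write-job");
        s.sweep(System.currentTimeMillis() + 1_000);  // running jobs are never swept
        if (!s.contains("f1")) throw new AssertionError("running job was evicted");
        s.complete("f1");
        s.sweep(System.currentTimeMillis() + 1_000);  // finished and past TTL: evicted
        if (s.contains("f1")) throw new AssertionError("finished job should be gone");
        System.out.println("ok");
    }
}
```

The key point is that an entry only becomes evictable by being explicitly moved at completion time, so a long-running job can never be expired out from under you.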
It occurred to me yesterday, as I upgraded to Guava 11, that there is a way to lock entries into the cache: use maximumWeight as the sole eviction mechanism, and have your Weigher return a weight of 0 for entries that should stay locked into the cache. I haven't verified this by testing, but the documentation for CacheBuilder.weigher() states that entries with a weight of zero are not considered for size-based eviction (though they may still be evicted by other means).
I am not completely clear on the exact problem, but another solution would be to have a Cache with softValues() instead of a maximum size or expiry time. Every time you access a cache value (in your example, when you start the computation), you should maintain state somewhere else holding a strong reference to that value. This will prevent the value from being GCed. Whenever the use count of the value drops to zero (in your example, the computation ends and it's OK for the value to go away), you remove all the strong references. For example, you could use an AtomicLongMap with the Cache value as the AtomicLongMap key and periodically call removeAllZeros() on the map. Note that, as the Javadoc states, the use of softValues() does come with tradeoffs.
I looked at the Guava code and found that
CustomConcurrentHashMap (which CacheBuilder uses as an underlying implementation) adds removal notifications to an internal queue and processes them later. Reinserting the entry into the map from the removal callback is therefore technically "safe", but it opens a window during which the entry is not in the map, and an alternate entry could be produced by another thread via the CacheLoader.
I solved this problem using a suggestion from @Louis in his comment above. The primary map has no expiration at all, but each time I look something up in that map, I also add an entry to a secondary Cache that does have expireAfterAccess. In the removal listener for that secondary cache, I make some decisions about whether to remove the entry from the primary map. Nothing else uses the secondary cache. This seems to be an elegant solution for conditional eviction, and it's working. (I'm really still using the Guava r09 MapMaker for this solution, but it should work equally well with Guava 11.)
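The shape of that solution can be sketched with JDK types only (hypothetical names; in the real setup the secondary structure is a Guava Cache with expireAfterAccess, and the decision below runs inside its RemovalListener):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical sketch of the conditional-eviction pattern described above: the
// primary map never expires on its own; a secondary "access tracker" does, and
// its removal hook decides whether the primary entry may actually be dropped.
public class ConditionalEviction {
    private final Map<String, String> primary = new ConcurrentHashMap<>();
    private final Map<String, Long> lastAccess = new ConcurrentHashMap<>(); // secondary
    private final long ttlMillis;

    public ConditionalEviction(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public String get(String key) {
        lastAccess.put(key, System.currentTimeMillis()); // touch the secondary entry
        return primary.computeIfAbsent(key, k -> "value-for-" + k);
    }

    // Stand-in for the secondary cache's expiry plus removal listener: when a
    // tracker entry expires, evict from the primary map only if the job is done;
    // otherwise keep the tracker entry so the decision is revisited later.
    public void maintain(long now, Predicate<String> jobFinished) {
        lastAccess.entrySet().removeIf(e -> {
            if (now - e.getValue() < ttlMillis) return false; // not expired yet
            if (!jobFinished.test(e.getKey())) return false;  // still running: keep
            primary.remove(e.getKey());
            return true;
        });
    }

    public boolean inPrimary(String key) {
        return primary.containsKey(key);
    }

    public static void main(String[] args) {
        ConditionalEviction c = new ConditionalEviction(100);
        c.get("f1");
        long later = System.currentTimeMillis() + 1_000;
        c.maintain(later, key -> false); // expired, but job still running: kept
        if (!c.inPrimary("f1")) throw new AssertionError("running job was evicted");
        c.maintain(later, key -> true);  // job finished: now evicted
        if (c.inPrimary("f1")) throw new AssertionError("finished entry should be gone");
        System.out.println("ok");
    }
}
```

Because the primary map is never evicted directly, the one-value-per-key invariant from the question is preserved; expiry only ever proposes a removal, and the hook can veto it while the job is running.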