Is it safe to reinsert the entry from Guava Remova

2019-04-19 23:29发布

I've got a Guava Cache (or rather, I am migrating from MapMaker to Cache) and the values represent long-running jobs. I'd like to add expireAfterAccess behavior to the cache, as it's the best way to clean it up; however, the job may still be running even though it hasn't been accessed via the cache in some time, and in that case I need to prevent it from being removed from the cache. I have three questions:

  1. Is it safe to reinsert the cache entry that's being removed during the RemovalListener callback?

  2. If so, is it threadsafe, such that there's no possible way the CacheLoader could produce a second value for that key while the RemovalListener callback is still happening in another thread?

  3. Is there a better way to achieve what I want? This isn't strictly/only a "cache" - it's critical that one and only one value is used for each key - but I also want to cache the entry for some time after the job it represents is complete. I was using MapMaker before and the behaviors I need are now deprecated in that class. Regularly pinging the map while the jobs are running is inelegant, and in my case, infeasible. Perhaps the right solution is to have two maps, one without eviction, and one with, and migrate them across as they complete.

I'll make a feature request too - this would solve the problem: allow individual entries to be locked to prevent eviction (and then subsequently unlocked).

[Edit to add some details]: The keys in this map refer to data files. The values are either a running write job, a completed write job, or - if no job is running - a read-only, produced-on-lookup object with information read from the file. It's important that there is exactly zero or one entry for each file. I could use separate maps for the two things, but there would have to be coordination on a per-key basis to make sure only one or the other is in existence at one time. Using a single map makes it simpler, in terms of getting the concurrency correct.

4条回答
迷人小祖宗
2楼-- · 2019-04-20 00:12

To address the first two questions in your post directly, removal listeners are triggered at some arbitrary point in time after elements are removed from the cache, therefore you cannot use removal listeners to atomically re-insert a value.

  1. Is it safe to reinsert the cache entry that's being removed during the RemovalListener callback?

Yes, this is safe; calling .put() from within a RemovalListener will not break the cache in any way. Broadly speaking, Cache has no problem with concurrent mutations.

  1. If so, is it threadsafe, such that there's no possible way the CacheLoader could produce a second value for that key while the RemovalListener callback is still happening in another thread?

No, mutations triggered by a RemovalListener are not atomic with respect to the associated removal. Not only is it possible for another thread to access the cache in-between the removal finishing and the listener being notified, it's possible for there to be a significant delay between these two actions. As the wiki mentions "removal listener operations are executed synchronously [during] cache maintenance". It goes on to say that maintenance tasks are executed in small batches and no guarantees are provided as to when (or implicitly if) they will actually be carried out.

  1. Is there a better way to achieve what I want? This isn't strictly/only a "cache" - it's critical that one and only one value is used for each key - but I also want to cache the entry for some time after the job it represents is complete.

Like you say, this isn't really a cache (or more specifically, an evicting cache). If there are objects you don't want to be evicted, simply don't store them in an evicting cache. Like Louis suggest using a two-data-structure setup with a map for ongoing processes and an evicting cache for finished processes would do exactly what you need.

查看更多
再贱就再见
3楼-- · 2019-04-20 00:17

It occurred to me yesterday as I upgraded to Guava 11 that there is a way to lock entries into the cache: use maximumWeight as the sole removal method, and have your Weigher return a 0 weight for entries that should stay locked into the cache. I haven't verified this by testing, but the documentation for CacheBuilder.weigher() states,

"When the weight of an entry is zero it will not be considered for size-based eviction (though it still may be evicted by other means)."

查看更多
啃猪蹄的小仙女
4楼-- · 2019-04-20 00:18

I am not completely clear on the exact problem but another solution would be to have a Cache with softValues() instead of a maximum size or expiry time. Every time you access the cache value (in your example, start the computation), you should maintain state somewhere else with a strong reference to this value. This will prevent the value from being GCed. Whenever the use of this value drops to zero (in your example, the computation ends and its OK for the value to go away), you could remove all strong references. For example, you could use the AtomicLongMap with the Cache value as the AtomicLongMap key and periodically call removeAllZeros() on the map.

Note that, as the Javadoc states, the use of softValues() does come with tradeoffs.

查看更多
啃猪蹄的小仙女
5楼-- · 2019-04-20 00:22

I looked at the Guava code and found that CustomConcurrentHashMap (which CacheBuilder uses as an underlying implementation) adds removal notifications to an internal queue and processes them later. Therefore, reinserting the entry into the map from the removal callback is technically "safe", but opens a window during which the entry is not in the map, and therefore an alternate entry could be produced by another thread via the CacheLoader.

I solved this problem using a suggestion from @Louis in his comment above. The primary map has no expiration at all, but each time I look something up in that map, I also add an entry to a secondary Cache that does have expireAfterAccess. In the removal listener for that secondary cache, I make some decisions about whether to remove the entry from the primary map. Nothing else uses the secondary cache. This seems to be an elegant solution for conditional eviction, and it's working. (I'm really still using the Guava r09 MapMaker for this solution, but it should work equally well for Guava 11.)

查看更多
登录 后发表回答