I am currently implementing cache. I have completed basic implementation, like below. What I want to do is to run a thread that will remove entry that satisfy certain conditions.
class Cache {
int timeLimit = 10; //how long each entry needs to be kept after accessed(marked)
int maxEntries = 10; //maximum number of Entries
HashSet<String> set = new HashSet<String>();
public void add(Entry t){
....
}
public Entry access(String key){
//mark Entry that it has been used
//Since it has been marked, background thread should remove this entry after timeLimit seconds.
return set.get(key);
}
....
}
My question is, how should I implement background thread so that the thread will go around the entries in set and remove the ones that has been marked && (last access time - now)>timeLimit
?
edit
Above is just simplified version of codes, that I did not write synchronized statements.
I don't imagine you really need a background thread. Instead you can just remove expired entries before or after you perform a lookup. This simplifies the entire implementation and its very hard to tell the difference.
BTW: If you use a LinkedHashMap, you can use it as a LRU cache by overriding removeEldestEntry (see its javadocs for an example)
First of all, your presented code is incomplete because there is no
get(key)
onHashSet
(so I assume you mean some kind ofMap
instead) and your code does not mention any "marking." There are also many ways to do caching, and it is difficult to pick out the best solution without knowing what you are trying to cache and why.When implementing a cache, it is usually assumed that the data-structure will be accessed concurrently by multiple threads. So the first thing you will need to do, is to make use of a backing data-structure that is thread-safe.
HashMap
is not thread-safe, butConcurrentHashMap
is. There are also a number of other concurrentMap
implementations out there, namely in Guava, Javolution and high-scale lib. There are other ways to build caches besides maps, and their usefulness depends on your use case. Regardless, you will most likely need to make the backing data-structure thread-safe, even if you decide you don't need the background thread and instead evict expired objects upon attempting to retrieve them from the cache. Or letting the GC remove the entries by usingSoftReference
s.Once you have made the internals of your cache thread-safe, you can simply fire up a new (most likely daemonized) thread that periodically sweeps/iterates the cache and removes old entries. The thread would do this in a loop (until interrupted, if you want to be able to stop it again) and then sleep for some amount of time after each sweep.
However, you should consider whether it is worth it for you, to build your own cache implementation. Writing thread-safe code is not easy, and I recommend that you study it before endeavouring to write your own cache implementation. I can recommend the book Java Concurrency in Practice.
The easier way to go about this is, of course, to use an existing cache implementation. There are many options available in Java-land, all with their own unique set of trade-offs.
ConcurrentMap
based API.Since you want to limit the number of entries in the cache, you might be interested in an object-pool instead of a cache.
I'd use Guava's Cache type for this, personally. It's already thread-safe and has methods built in for eviction from the cache based on some time limit. If you want a thread to periodically sweep it, you can just do something like this:
Why are you reinventing the wheel? EhCache (and any decent cache implementation) will do this for you. Also much more lightweight
MapMaker
Cache
from Guava can automatically remove old entries.If you really want to implement this yourself, it is not really that simple.
Remember about synchronization. You should use
ConcurrentHashMap
orsynchronized
keyword to store entries. This might be really tricky.You must store last access time somehow of each entry somehow. Every time you access an entry, you must update that timestamp.
Think about eviction policy. If there are more than
maxEntries
in your cache, which ones to remove first?Do you really need a background thread?
This is surprising, but EhCache (enterprise ready and proven) does not use background thread to invalidate old entries). Instead it waits until the map is full and removes entries lazily. This looks like a good trade-off as threads are expensive.
If you have a background thread, should there be one per cache or one global? Do you start a new thread while creating a new cache or have a global list of all caches? This is harder than you think...
Once you answer all these questions, the implementation is fairly simple: go through all the entries every second or so and if the condition you've already written is met, remove the entry.
First, make access to your collection either
synchronized
or useaConcurrentHashSet
ConcurrentHashMap
basedSet
as indicated in the comments below.Second, write your new thread, and implement it as an endless loop that periodically iterates the prior collection and removes the elements. You should write this class in a way that it is initialized with the correct collection in the constructor, so that you do not have to worry about "how do I access the proper collection".