My use case is to maintain an in-memory cache over the data stored in a persistent DB.
I use the data to populate a list/map of entries on the UI. At any given time, the data displayed on the UI should be as up to date as possible (this can be controlled through the refresh frequency of the cache).
The major difference between a regular cache implementation and this one is that it needs a bulk refresh of all elements at regular intervals, so it is quite different from an LRU-style cache.
I need to implement this in Java, and it would be great if there are existing frameworks it could be built around.
I have explored the Google Guava cache library, but it is more suited to a per-entry refresh than a bulk refresh. There is no simple API that refreshes the whole cache.
Any help will be highly appreciated.
Also, it would be great if the refresh could be done incrementally. The only limitation of refreshing the whole cache at once is that, if the cache is very large, the heap must be at least twice the size of the cache in order to load the new entries and then replace the old map with the new one. An incremental or chunked refresh (refreshing in equal-sized batches) would avoid this.
EHCache is a pretty full-featured Java caching library. I would imagine they have something which would work for you.
In order to do an incremental reload of a cache (which would work on almost any cache), just iterate through the currently loaded entries and force-refresh them. You could run this task on a background scheduler.
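A minimal sketch of that idea in plain Java, assuming a per-key loader function (a hypothetical stand-in for the DB lookup) and a fixed chunk size. The class and method names are made up for illustration and do not come from any library. Each call reloads at most one chunk in place, so only a snapshot of the keys is copied, not a second copy of all the values.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class ChunkedRefreshCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // hypothetical: loads one entry from the DB
    private final int chunkSize;
    private Iterator<K> cursor;          // position within the current refresh pass

    ChunkedRefreshCache(Function<K, V> loader, int chunkSize) {
        this.loader = loader;
        this.chunkSize = chunkSize;
    }

    /** Returns the cached value, loading it on first access. */
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    /** Reloads up to chunkSize entries, continuing where the previous call stopped. */
    synchronized int refreshNextChunk() {
        if (cursor == null || !cursor.hasNext()) {
            // Start a new pass over a snapshot of the keys (keys only, not values).
            cursor = new ArrayList<>(entries.keySet()).iterator();
        }
        int refreshed = 0;
        while (refreshed < chunkSize && cursor.hasNext()) {
            K key = cursor.next();
            // Replaces the value in place; a null result from the loader removes the entry.
            entries.computeIfPresent(key, (k, old) -> loader.apply(k));
            refreshed++;
        }
        return refreshed;
    }

    /** Runs refreshNextChunk on a background thread; the caller shuts the scheduler down. */
    ScheduledExecutorService startBackgroundRefresh(long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Note: if the loader throws, the scheduler stops running further refreshes.
        scheduler.scheduleWithFixedDelay(this::refreshNextChunk,
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

With this shape, readers can see a mix of old and new entries while a pass is in progress, which is the price paid for not holding two full copies in memory.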
As an alternative to forcing the entire cache to reload, EHCache has the ability to specify a "time-to-live" for an entry, so entries will automatically be reloaded if they are too stale.
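The time-to-live idea does not depend on EHCache itself. The following is a plain-Java sketch of the concept, not EHCache's API: each entry remembers when it was loaded and is reloaded on access once it is older than the TTL. The `loader` function and the injectable `clock` are assumptions added for illustration (the clock mainly makes the behaviour easy to test).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

class TtlCache<K, V> {
    /** A cached value together with the time it was loaded. */
    private static final class Timed<T> {
        final T value;
        final long loadedAt;

        Timed(T value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<K, Timed<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // hypothetical: loads one entry from the DB
    private final long ttlMillis;
    private final LongSupplier clock;    // e.g. System::currentTimeMillis

    TtlCache(Function<K, V> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, reloading it first if it is missing or older than the TTL. */
    V get(K key) {
        long now = clock.getAsLong();
        Timed<V> entry = entries.compute(key, (k, current) ->
                (current == null || now - current.loadedAt >= ttlMillis)
                        ? new Timed<>(loader.apply(k), now)
                        : current);
        return entry.value;
    }
}
```

Entries that are never read are never reloaded here, so this spreads the reload cost over time instead of refreshing everything at once.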
Just inherit from this class and implement loadFromDB and updateData as needed to get incremental updates.
import org.apache.log4j.Logger;
import java.util.List;
import java.util.concurrent.Semaphore;

public abstract class Updatable<T>
{
    protected volatile long lastRefreshed = 0;
    private static final int REFRESH_FREQUENCY_MILLISECONDS = 300000; // 5 minutes
    private Thread updateThread;
    // Single permit: at most one refresh runs at any time
    private final Semaphore updateInProgress = new Semaphore(1);
    protected static final Logger log = Logger.getLogger(Updatable.class);

    /**
     * Reloads the data synchronously, waiting for any refresh already in progress.
     */
    public void forceRefresh()
    {
        try
        {
            updateInProgress.acquire();
        }
        catch (InterruptedException e)
        {
            // The permit was not acquired, so do not load or release below
            Thread.currentThread().interrupt();
            log.warn("forceRefresh Interrupted");
            return;
        }
        try
        {
            loadAllData();
        }
        catch (Exception e)
        {
            log.error("Exception while updating data from DB", e);
        }
        finally
        {
            updateInProgress.release();
        }
    }

    /**
     * Call this from the subclass's read methods: starts a background
     * reload if the data is older than the refresh interval.
     */
    protected void checkRefresh()
    {
        if (lastRefreshed + REFRESH_FREQUENCY_MILLISECONDS < System.currentTimeMillis())
            startUpdateThread();
    }

    private void startUpdateThread()
    {
        // tryAcquire does not block: if a refresh is already running, do nothing
        if (updateInProgress.tryAcquire())
        {
            updateThread = new Thread(new Runnable()
            {
                public void run()
                {
                    try
                    {
                        loadAllData();
                    }
                    catch (Exception e)
                    {
                        log.error("Exception while updating data from DB", e);
                    }
                    finally
                    {
                        updateInProgress.release();
                    }
                }
            });
            updateThread.start();
        }
    }

    /**
     * Implement this function to load the data from DB
     *
     * @return the freshly loaded entries
     */
    protected abstract List<T> loadFromDB();

    /**
     * Implement this function to hotswap the data in memory after it was loaded from DB
     *
     * @param data the entries returned by loadFromDB
     */
    protected abstract void updateData(List<T> data);

    private void loadAllData()
    {
        List<T> l = loadFromDB();
        updateData(l);
        lastRefreshed = System.currentTimeMillis();
    }

    /**
     * Marks the data as stale so that the next checkRefresh triggers a reload.
     */
    public void invalidateCache()
    {
        lastRefreshed = 0;
    }
}
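The class above leaves the actual hot swap to updateData. One possible way to do that swap, sketched here as a small standalone class so it can be run on its own: build the new map completely, then publish it with a single volatile write. `SnapshotHolder` and the `keyOf` function are hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class SnapshotHolder<K, T> {
    private final Function<T, K> keyOf; // hypothetical: extracts the map key from an entry
    // Readers always see either the old map or the new one, never a half-built map.
    private volatile Map<K, T> snapshot = Collections.emptyMap();

    SnapshotHolder(Function<T, K> keyOf) {
        this.keyOf = keyOf;
    }

    /** Builds a fresh map from the loaded entries, then swaps it in atomically. */
    void updateData(List<T> data) {
        Map<K, T> fresh = new HashMap<>(data.size() * 2);
        for (T item : data) {
            fresh.put(keyOf.apply(item), item);
        }
        snapshot = Collections.unmodifiableMap(fresh); // single volatile write
    }

    T get(K key) {
        return snapshot.get(key);
    }

    int size() {
        return snapshot.size();
    }
}
```

While updateData runs, both the old and the new map are alive, which is exactly the roughly doubled heap requirement mentioned in the question. In exchange, readers never block and never observe a partially refreshed cache.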
One thing to check is whether periodic refreshing is required at all. You could apply the refresh logic at the point where data is fetched from the cache; this removes the need for any asynchronous refreshing and for maintaining old copies of the cache. IMO this is the easiest and best way to refresh cached data, as it does not involve any additional overhead.
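T getData() {
    // Refresh if the refresh interval has elapsed since the last refresh
    // (lastRefreshed, REFRESH_INTERVAL_MS and refreshCache() are placeholder names)
    if (lastRefreshed + REFRESH_INTERVAL_MS <= System.currentTimeMillis()) {
        refreshCache(); // reload from the DB and update lastRefreshed
    }
    return data;
}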
This ensures that the data is refreshed according to the refresh interval without needing any asynchronous refresh.
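A self-contained sketch of this refresh-on-read approach, assuming the whole data set is produced by a single `loader` (a hypothetical stand-in for the DB query) and that time comes from an injectable `clock`. Both names are illustrative additions, not part of the answer above.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class LazyRefreshCache<T> {
    private final Supplier<T> loader;       // hypothetical: loads the full data set from the DB
    private final long refreshIntervalMillis;
    private final LongSupplier clock;       // e.g. System::currentTimeMillis
    private T data;
    private long lastRefreshed;
    private boolean loaded;

    LazyRefreshCache(Supplier<T> loader, long refreshIntervalMillis, LongSupplier clock) {
        this.loader = loader;
        this.refreshIntervalMillis = refreshIntervalMillis;
        this.clock = clock;
    }

    /** Returns the cached data, reloading it first if the refresh interval has elapsed. */
    synchronized T getData() {
        long now = clock.getAsLong();
        if (!loaded || now - lastRefreshed >= refreshIntervalMillis) {
            data = loader.get();
            lastRefreshed = now;
            loaded = true;
        }
        return data;
    }
}
```

The trade-off in this sketch is that the caller whose read triggers the refresh waits for the DB load, and because getData is synchronized, other readers wait with it. If the load is slow and the UI must stay responsive, that may be a reason to prefer a background refresh after all.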