I have an ASP.NET site with a fairly slow search function, and I want to improve performance by adding the results to the cache for one hour using the query as the cache-key:
using System;
using System.Web;
using System.Web.Caching;
public class Search
{
private static object _cacheLock = new object();
public static string DoSearch(string query)
{
string results = "";
if (HttpContext.Current.Cache[query] == null)
{
lock (_cacheLock)
{
if (HttpContext.Current.Cache[query] == null)
{
results = GetResultsFromSlowDb(query);
HttpContext.Current.Cache.Add(query, results, null, DateTime.Now.AddHours(1), Cache.NoSlidingExpiration, CacheItemPriority.Normal, null);
}
else
{
results = HttpContext.Current.Cache[query].ToString();
}
}
}
else
{
results = HttpContext.Current.Cache[query].ToString();
}
return results;
}
private static string GetResultsFromSlowDb(string query)
{
return "Hello World!";
}
}
Let’s say visitor A does a search. The cache is empty, the lock is set and the result is requested from the database. Now visitor B comes along with a different search: Won’t visitor B have to wait by the lock until visitor A’s search has completed? What I really wanted was for B to call the database immediately, because the results will be different and the database can handle multiple requests – I just don’t want to repeat expensive unnecessary queries.
What would be the correct approach for this scenario?
Your code has a potential race condition:
In general I wouldn't use locking, and would do it as follows to avoid the race condition:
In the above case, multiple threads might attempt to load data if they detect a cache miss at about the same time. In practice this is likely to be rare, and in most cases unimportant, because the data they load will be equivalent.
But if you want to use a lock to prevent it you can do so as follows:
Unless you're absolutely certain that it's critical to have no redundant queries then I would avoid locking altogether. The ASP.NET cache is inherently thread-safe, so the only drawback to the following code is that you might temporarily see a few redundant queries racing each other when their associated cache entry expires:
If you decide that you really must avoid all redundant queries then you could use a set of more granular locks, one lock per query:
Your code is correct. You are also using
double-if-sandwitching-lock
which will prevent race conditions which is a common pitfall when not used. This will no lock access to existing stuff in the cache.The only problem is when many clients are inserting into the cache at the same time, and they will queue behind the lock but what I would do is to put the
results = GetResultsFromSlowDb(query);
outside the lock:If this is slow, your problem is elsewhere.