High performance caching

Published 2020-07-10 10:26

The following code is supposed to cache the last read. LastValueCache is a cache that can be accessed by many threads (this is why I use shared memory). It is OK for me to have race conditions, but I want other threads to see the changed LastValueCache.

class Repository
{
    private Item _lastValueCache;

    public Item LastValueCache
    {
        get
        {
            Thread.MemoryBarrier();
            Item result = _lastValueCache;
            Thread.MemoryBarrier();
            return result;
        }
        set
        {
            Thread.MemoryBarrier();
            _lastValueCache = value;
            Thread.MemoryBarrier();
        }
    }

    public void Save(Item item)
    {
        SaveToDatabase(item);
        Item cached = LastValueCache;
        if (cached == null || item.Stamp > cached.Stamp)
        {
            LastValueCache = item;
        }
    }

    public void Remove(Timestamp stamp)
    {
        RemoveFromDatabase(stamp);
        Item cached = LastValueCache;
        if (cached != null && cached.Stamp == stamp)
        {
            LastValueCache = null;
        }
    }

    public Item Get(Timestamp stamp)
    {
        Item cached = LastValueCache;
        if (cached != null && cached.Stamp == stamp)
        {
            return cached;
        }

        return GetFromDatabase(stamp);
    }
}

The Repository object is used by many threads. I do not want to use locking because it would affect performance, which in my case is more important than data consistency. The question is: what is the smallest synchronization mechanism that will fit my need? Would volatile, or a single MemoryBarrier in the getter and setter, be enough?

4 Answers
家丑人穷心不美
#2 · 2020-07-10 10:44

I implemented a thread-safe pseudo LRU designed for concurrent workloads: ConcurrentLru. Performance is very close to ConcurrentDictionary, roughly 10x faster than MemoryCache, and the hit rate is better than a conventional LRU. A full analysis is provided at the GitHub link below.

Usage looks like this:

int capacity = 666;
var lru = new ConcurrentLru<int, SomeItem>(capacity);

var value = lru.GetOrAdd(1, (k) => new SomeItem(k));

GitHub: https://github.com/bitfaster/BitFaster.Caching

Install-Package BitFaster.Caching
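
For the repository in the question, Get could delegate straight to the cache. A minimal sketch, assuming the Item and Timestamp types and the GetFromDatabase call from the question; the capacity of 128 is an arbitrary placeholder:

using System;
using BitFaster.Caching.Lru;

class Repository
{
    // hypothetical capacity; size it to your working set
    private readonly ConcurrentLru<Timestamp, Item> _lru =
        new ConcurrentLru<Timestamp, Item>(128);

    public Item Get(Timestamp stamp)
    {
        // the factory only runs on a cache miss
        return _lru.GetOrAdd(stamp, s => GetFromDatabase(s));
    }

    private Item GetFromDatabase(Timestamp stamp)
    {
        // placeholder for the database read from the question
        throw new NotImplementedException();
    }
}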
\"骚年 ilove
3楼-- · 2020-07-10 10:44

An alternative may be the ReaderWriterLockSlim class, which enables thread synchronization for reader/writer scenarios like yours.

    private readonly ReaderWriterLockSlim _rw = new ReaderWriterLockSlim();
    private Item _lastValueCache;

    public Item LastValueCache
    {
        get
        {
            _rw.EnterReadLock();
            try { return _lastValueCache; }
            finally { _rw.ExitReadLock(); }
        }
        set
        {
            _rw.EnterWriteLock();
            try { _lastValueCache = value; }
            finally { _rw.ExitWriteLock(); }
        }
    }

This is an efficient implementation for reader/writer synchronization (the try/finally blocks guarantee the lock is released even if the read or write throws). This blog entry by Eric Lippert may be interesting for you.

Another interesting option may be the ConcurrentExclusiveSchedulerPair class, which creates two TaskSchedulers so that tasks can be split into 'tasks with read access' (Concurrent) and 'tasks with write access' (Exclusive).

More information about this can be found here.
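
A minimal sketch of the scheduler-pair idea, assuming the Repository and Item types from the question (the wrapper class and method are placeholders, not part of the original answer):

using System.Threading.Tasks;

static class SchedulerPairExample
{
    public static void Run(Repository repository, Item item)
    {
        var pair = new ConcurrentExclusiveSchedulerPair();

        // reads may run concurrently with each other
        var readFactory = new TaskFactory(pair.ConcurrentScheduler);

        // writes run one at a time and never overlap a read
        var writeFactory = new TaskFactory(pair.ExclusiveScheduler);

        Task<Item> read = readFactory.StartNew(() => repository.LastValueCache);
        Task write = writeFactory.StartNew(() => repository.Save(item));
    }
}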

放荡不羁爱自由
#4 · 2020-07-10 10:45

volatile should be sufficient and the most performant option for your case. There are potential scenarios where volatile alone is not sufficient for ensuring a read gets the most up-to-date value, but in this case I don't think it would be an important factor. See the "The volatile keyword" section of this awesome article by Joe Albahari for details.
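
In code, that amounts to replacing the explicit barriers in the question with a single volatile backing field; a minimal sketch, assuming the Item type from the question:

class Repository
{
    // volatile gives the read acquire semantics and the write release
    // semantics, so other threads see the latest assigned reference
    private volatile Item _lastValueCache;

    public Item LastValueCache
    {
        get { return _lastValueCache; }
        set { _lastValueCache = value; }
    }
}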

手持菜刀,她持情操
#5 · 2020-07-10 10:56

If this is stupid you don't need to vote me down.
Just tell me and I will delete it.
But I am not following this logic.

public void Save(Item item)
{
    SaveToDatabase(item);
    Item cached = LastValueCache;
    if (cached == null || item.Stamp > cached.Stamp)
    {
        LastValueCache = item;
    }
}

You are worried about memory-access overhead, but you are waiting on a write to the database before you update the cache.
Based on the public Item Get(Timestamp stamp) signature, stamp is the key.

Let's assume a DB write is 20 ms, a DB read is 10 ms, and a cache get or set is 2 ms each.

public void Save(Item item)
    SaveToDatabase(item);                              // 20 ms
    Item cached = LastValueCache;                      //  2 ms
    if (cached == null || item.Stamp > cached.Stamp)   //  1 ms
        LastValueCache = item;                         //  2 ms

During those 23 ms prior to LastValueCache = item;, any call to public Item Get(Timestamp stamp) is going to hit the database and not the cache.

During the same 23 ms, any read of the LastValueCache getter returns a value that is stale by up to 23 ms. The stated objective is for other threads to see LastValueCache, but they are seeing a stale LastValueCache.

Same thing with Remove.
You are going to have several hits to the database you could have avoided.

What are you trying to achieve?
Have you profiled this?

My bet is the bottleneck is the calls to the database.
A database call is 1000x longer than the difference between a lock and a MemoryBarrier.

public void Save(Item item)
{
    // add logic to check that the prior asynchronous call to SaveToDatabase is complete
    // if not, wait for it to complete
    // LastValueCache will possibly be replaced, so you need the last item in the database
    // the time for a lock is not really a factor, as it will be faster than the prior update
    Item cached = LastValueCache;
    if (cached == null || item.Stamp > cached.Stamp)
    {
        LastValueCache = item;
    }
    // make the next call a task or background operation so it does not block
    SaveToDatabase(item);
}
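
As a sketch of what those comments describe (the _pendingSave field and the gate object are my assumptions, not part of the original code): update the cache first so readers see the new value immediately, then chain the database write onto the previous one so saves still reach the database in order.

using System.Threading.Tasks;

class Repository
{
    private volatile Item _lastValueCache;
    private Task _pendingSave = Task.CompletedTask;
    private readonly object _saveGate = new object();

    public void Save(Item item)
    {
        // update the cache first so readers see the new value immediately
        Item cached = _lastValueCache;
        if (cached == null || item.Stamp > cached.Stamp)
        {
            _lastValueCache = item;
        }

        lock (_saveGate)
        {
            // chain onto the previous save so writes reach the database in order
            _pendingSave = _pendingSave.ContinueWith(_ => SaveToDatabase(item));
        }
    }

    private void SaveToDatabase(Item item)
    {
        // placeholder for the database write from the question
    }
}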

You could even change the logic to only wait for the prior call if you are setting LastValueCache = item;
but you need to throttle the database somehow.

The next step would be to cache the last X items and use that cache in Item Get(Timestamp stamp).
The database calls are what you need to optimize.
Again, you need to profile. A crude sketch of such a cache follows.
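
A deliberately crude sketch of a last-X cache keyed by stamp; the capacity and the clear-when-full eviction are placeholders, and a real implementation would use a proper LRU such as the ConcurrentLru mentioned above:

using System;
using System.Collections.Concurrent;

class RecentItemCache
{
    private const int Capacity = 128; // hypothetical size
    private readonly ConcurrentDictionary<Timestamp, Item> _recent =
        new ConcurrentDictionary<Timestamp, Item>();

    public Item GetOrLoad(Timestamp stamp, Func<Timestamp, Item> loadFromDatabase)
    {
        if (_recent.TryGetValue(stamp, out Item cached))
        {
            return cached;
        }

        Item item = loadFromDatabase(stamp);
        if (_recent.Count >= Capacity)
        {
            _recent.Clear(); // crude eviction, good enough for a sketch
        }
        _recent[stamp] = item;
        return item;
    }
}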

After that the logic would get more complex, but feed the database calls to a BlockingCollection. You would need to be sure the last-X cache is bigger than the BlockingCollection's size; if not, block and wait for the BlockingCollection to clear. And you would need to use the same BlockingCollection for inserts and deletes so they are processed in order. You could get smart enough that you just don't insert a record that has a pending delete, and don't insert or delete only a single record at a time.
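
A rough sketch of that pipeline (the Action-based operation type and the single consumer are my choices, not the answer's): a bounded BlockingCollection throttles producers, and one long-running consumer drains it so inserts and deletes hit the database in order.

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class WriteBehindQueue
{
    // bounded so producers block when the database falls behind
    private readonly BlockingCollection<Action> _ops =
        new BlockingCollection<Action>(boundedCapacity: 1024);

    public WriteBehindQueue()
    {
        // a single long-running consumer preserves insert/delete ordering
        Task.Factory.StartNew(Drain, TaskCreationOptions.LongRunning);
    }

    public void Enqueue(Action databaseOperation) => _ops.Add(databaseOperation);

    private void Drain()
    {
        foreach (Action op in _ops.GetConsumingEnumerable())
        {
            op(); // execute the queued database insert or delete
        }
    }
}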
