I am using the StackExchange.Redis client to insert a dictionary of key-value pairs into Redis using a batch, as below:
    private static StackExchange.Redis.IDatabase _database;

    public void SetAll<T>(Dictionary<string, T> data, int cacheTime)
    {
        lock (_database)
        {
            TimeSpan expiration = new TimeSpan(0, cacheTime, 0);
            var list = new List<Task<bool>>();
            var batch = _database.CreateBatch();
            foreach (var item in data)
            {
                string serializedObject = JsonConvert.SerializeObject(item.Value, Formatting.Indented,
                    new JsonSerializerSettings { ContractResolver = new SerializeAllContractResolver(), ReferenceLoopHandling = ReferenceLoopHandling.Ignore });
                var task = batch.StringSetAsync(item.Key, serializedObject, expiration);
                list.Add(task);
                serializedObject = null;
            }
            batch.Execute();
            Task.WhenAll(list.ToArray());
        }
    }
My problem: it takes around 7 seconds to set just 350 dictionary items.
My question: is this the right way to bulk-insert items into Redis, or is there a quicker way to do this? Any help is appreciated. Thanks.
"just" is a very relative term, and doesn't really make sense without more context, in particular: how big are these payloads?
However, to clarify a few points to help you investigate:

- there is no need to lock the IDatabase unless that is purely for your own purposes; SE.Redis deals with thread safety internally and is intended to be used by competing threads
- at the moment your timing includes all of the serialization code (JsonConvert.SerializeObject); this will add up, especially if your objects are big; to get a decent measure, I strongly suggest you time the serialization and redis times separately
- the batch.Execute() method uses a pipeline API and does not wait for responses between calls, so: the time you're seeing is not the cumulative effect of latency; that leaves just local CPU (for serialization), network bandwidth, and server CPU; the client library tools can't impact any of those things
- there is a StringSet overload that accepts a KeyValuePair<RedisKey, RedisValue>[]; you could choose to use this instead of a batch, but the only difference here is that it is the variadic MSET rather than multiple SET; either way, you'll be blocking the connection for other callers for the duration (since the purpose of batch is to make the commands contiguous)
- you don't actually need to use CreateBatch here, especially since you're locking the database (but I still suggest you don't need to do this); the purpose of CreateBatch is to make a sequence of commands sequential, but I don't see that you need this here; you could just use _database.StringSetAsync for each command in turn, which would also have the advantage that you'd be running serialization in parallel to the previous command being sent - it would allow you to overlap serialization (CPU bound) and redis ops (IO bound) without any work except to delete the CreateBatch call; this will also mean that you don't monopolize the connection from other callers

So the first thing I would do would be to remove some code:
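Something like this, for example - a sketch rather than a drop-in replacement (hoisting the serializer settings into a field and using Task.WaitAll so the method actually blocks are my own tweaks):

    private static StackExchange.Redis.IDatabase _database;
    private static readonly JsonSerializerSettings _jsonSettings = new JsonSerializerSettings
    {
        ContractResolver = new SerializeAllContractResolver(),
        ReferenceLoopHandling = ReferenceLoopHandling.Ignore
    };

    public void SetAll<T>(Dictionary<string, T> data, int cacheTime)
    {
        TimeSpan expiration = new TimeSpan(0, cacheTime, 0);
        var list = new List<Task<bool>>();

        foreach (var item in data)
        {
            // serialization (CPU bound) overlaps with the previous command being sent (IO bound)
            string serializedObject = JsonConvert.SerializeObject(
                item.Value, Formatting.Indented, _jsonSettings);

            // no lock, no CreateBatch: just issue each command and keep the pending task
            list.Add(_database.StringSetAsync(item.Key, serializedObject, expiration));
        }

        // block until all the pipelined replies have arrived
        // (Task.WhenAll on its own only creates a task; nothing was awaiting it)
        Task.WaitAll(list.ToArray());
    }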
The second thing I would do would be to time the serialization separately to the redis work.
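For example, something along these lines (a rough sketch that plugs into the loop above; the two Stopwatch names are mine):

    var serializeTimer = new System.Diagnostics.Stopwatch();
    var redisTimer = new System.Diagnostics.Stopwatch();

    foreach (var item in data)
    {
        serializeTimer.Start();
        string serializedObject = JsonConvert.SerializeObject(
            item.Value, Formatting.Indented, _jsonSettings);
        serializeTimer.Stop();

        redisTimer.Start();
        list.Add(_database.StringSetAsync(item.Key, serializedObject, expiration));
        redisTimer.Stop();
    }
    // note: redisTimer only covers issuing the commands; include the time spent
    // waiting on the tasks afterwards if you want the full round-trip cost
    Console.WriteLine("serialize: {0}ms, redis: {1}ms",
        serializeTimer.ElapsedMilliseconds, redisTimer.ElapsedMilliseconds);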
The third thing I would do would be to see if I can serialize to a MemoryStream instead, ideally one that I can re-use - to avoid the string allocation and UTF-8 encode:
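For example, something like this (a sketch; SerializeToUtf8 is a helper name I've made up, and ms.ToArray() still copies - a reusable buffer or stream would remove even that):

    private static readonly JsonSerializer _serializer = JsonSerializer.Create(_jsonSettings);

    private static byte[] SerializeToUtf8<T>(T value)
    {
        using (var ms = new MemoryStream())
        {
            using (var writer = new StreamWriter(ms, new UTF8Encoding(false), 4096, leaveOpen: true))
            {
                // writes UTF-8 straight into the stream: no intermediate string,
                // no separate UTF-16 -> UTF-8 encode step
                _serializer.Serialize(writer, value);
            }
            return ms.ToArray();
        }
    }

    // inside the loop, byte[] converts implicitly to RedisValue:
    list.Add(_database.StringSetAsync(item.Key, SerializeToUtf8(item.Value), expiration));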
This second answer is kinda tangential, but based on the discussion it sounds as though the main cost is serialization:
One thing you could do here is not store JSON. JSON is relatively large and, being text-based, relatively expensive to process both for serialization and deserialization. Unless you're using rejson, redis just treats your data as an opaque blob, so it doesn't care what the actual value is. As such, you can use more efficient formats.

I'm hugely biased, but we make use of protobuf-net in our redis storage. protobuf-net is optimized for:

- small output (a dense binary format rather than text)
- fast binary processing
- working with the C# types you already have, rather than requiring types generated from a .proto schema
I suggest protobuf-net rather than Google's own C# protobuf library because of the last bullet point, meaning: you can use it with the data you already have.
To illustrate why, consider the serializer comparison at https://aloiskraus.wordpress.com/2017/04/23/the-definitive-serialization-performance-guide/.
Notice in particular that the output size of protobuf-net is half that of Json.NET (reducing the bandwidth cost), and the serialization time is less than one fifth (reducing local CPU cost).
You would need to add some attributes to your model to help protobuf-net out (as per How to convert existing POCO classes in C# to google Protobuf standard POCO), but then this would be just:
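A sketch of what that might look like (assuming your types are attributed for protobuf-net):

    public void SetAll<T>(Dictionary<string, T> data, int cacheTime)
    {
        TimeSpan expiration = new TimeSpan(0, cacheTime, 0);
        var list = new List<Task<bool>>();

        foreach (var item in data)
        {
            using (var ms = new MemoryStream())
            {
                // protobuf-net writes a compact binary payload straight to the stream
                ProtoBuf.Serializer.Serialize(ms, item.Value);
                list.Add(_database.StringSetAsync(item.Key, ms.ToArray(), expiration));
            }
        }
        Task.WaitAll(list.ToArray());
    }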
As you can see, the code change to your redis code is minimal. Obviously you would need to use Deserialize<T> when reading the data back.

If your data is text based, you might also consider running the serialization through GZipStream or DeflateStream; if your data is dominated by text, it will compress very well.
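As a sketch of the read side (Get<T> is a hypothetical counterpart, not something from the code above):

    public T Get<T>(string key)
    {
        byte[] blob = _database.StringGet(key);   // RedisValue converts implicitly to byte[]
        if (blob == null) return default(T);      // key not found

        using (var ms = new MemoryStream(blob))
        {
            // if you compressed with GZipStream/DeflateStream on the way in,
            // wrap ms in the matching decompression stream here before deserializing
            return ProtoBuf.Serializer.Deserialize<T>(ms);
        }
    }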