How does PubSub work in BookSleeve/ Redis?

2020-07-23 02:12发布

问题:

I wonder what the best way is to publish and subscribe to channels using BookSleeve. I currently implement several static methods (see below) that let me publish content to a specific channel with the newly created channel being stored in private static Dictionary<string, RedisSubscriberConnection> subscribedChannels;.

Is this the right approach, given I want to publish to channels and subscribe to channels within the same application (note: my wrapper is a static class). Is it enough to create one channel even I want to publish and subscribe? Obviously I would not publish to the same channel than I would subscribe to within the same application. But I tested it and it worked:

 RedisClient.SubscribeToChannel("Test").Wait();
 RedisClient.Publish("Test", "Test Message");

and it worked.

Here my questions:

1) Will it be more efficient to setup a dedicated publish channel and a dedicated subscribe channel rather than using one channel for both?

2) What is the difference between "channel" and "PatternSubscription" semantically? My understanding is that I can subscribe to several "topics" through PatternSubscription() on the same channel, correct? But if I want to have different callbacks invoked for each "topic" I would have to setup a channel for each topic correct? Is that efficient or would you advise against that?

Here the code snippets.

Thanks!!!

    public static Task<long> Publish(string channel, byte[] message)
    {
        return connection.Publish(channel, message);
    }

    public static Task SubscribeToChannel(string channelName)
    {
        string subscriptionString = ChannelSubscriptionString(channelName);

        RedisSubscriberConnection channel = connection.GetOpenSubscriberChannel();

        subscribedChannels[subscriptionString] = channel;

        return channel.PatternSubscribe(subscriptionString, OnSubscribedChannelMessage);
    }

    public static Task UnsubscribeFromChannel(string channelName)
    {
        string subscriptionString = ChannelSubscriptionString(channelName);

        if (subscribedChannels.Keys.Contains(subscriptionString))
        {
            RedisSubscriberConnection channel = subscribedChannels[subscriptionString];

            Task  task = channel.PatternUnsubscribe(subscriptionString);

            //remove channel subscription
            channel.Close(true);
            subscribedChannels.Remove(subscriptionString);

            return task;
        }
        else
        {
            return null;
        }
    }

    private static string ChannelSubscriptionString(string channelName)
    {
        return channelName + "*";
    }

回答1:

1: there is only one channel in your example (Test); a channel is just the name used for a particular pub/sub exchange. It is, however, necessary to use 2 connections due to specifics of how the redis API works. A connection that has any subscriptions cannot do anything else except:

  • listen to messages
  • manage its own subscriptions (subscribe, psubscribe, unsubscribe, punsubscribe)

However, I don't understand this:

private static Dictionary<string, RedisSubscriberConnection>

You shouldn't need more than one subscriber connection unless you are catering for something specific to you. A single subscriber connection can handle an arbitrary number of subscriptions. A quick check on client list on one of my servers, and I have one connection with (at time of writing) 23,002 subscriptions. Which could probably be reduced, but: it works.

2: pattern subscriptions support wildcards; so rather than subscribing to /topic/1, /topic/2/ etc you could subscribe to /topic/*. The name of the actual channel used by publish is provided to the receiver as part of the callback signature.

Either can work. It should be noted that the performance of publish is impacted by the total number of unique subscriptions - but frankly it is still stupidly fast (as in: 0ms) even if you have tens of multiple thousands of subscribed channels using subscribe rather than psubscribe.

But from publish

Time complexity: O(N+M) where N is the number of clients subscribed to the receiving channel and M is the total number of subscribed patterns (by any client).

I recommend reading the redis documentation of pub/sub.


Edit for follow on questions:

a) I assume I would have to "publish" synchronously (using Result or Wait()) if I want to guarantee the order of sending items from the same publisher is preserved when receiving items, correct?

that won't make any difference at all; since you mention Result / Wait(), I assume you're talking about BookSleeve - in which case the multiplexer already preserves command order. Redis itself is single threaded, and will always process commands on a single connection in order. However: the callbacks on the subscriber may be executed asynchronously and may be handed (separately) to a worker thread. I am currently investigating whether I can force this to be in-order from RedisSubscriberConnection.

Update: from 1.3.22 onwards you can set the CompletionMode to PreserveOrder - then all callbacks will be completed sequentially rather than concurrently.

b) after making adjustments according to your suggestions I get a great performance when publishing few items regardless of the size of the payload. However, when sending 100,000 or more items by the same publisher performance drops rapidly (down to 7-8 seconds just to send from my machine).

Firstly, that time sounds high - testing locally I get (for 100,000 publications, including waiting for the response for all of them) 1766ms (local) or 1219ms (remote) (that might sound counter-intuitive, but my "local" isn't running the same version of redis; my "remote" is 2.6.12 on Centos; my "local" is 2.6.8-pre2 on Windows).

I can't make your actual server faster or speed up the network, but: in case this is packet fragmentation, I have added (just for you) a SuspendFlush() / ResumeFlush() pair. This disables eager-flushing (i.e. when the send-queue is empty; other types of flushing still happen); you might find this helps:

conn.SuspendFlush();
try {
    // start lots of operations...
} finally {
    conn.ResumeFlush();
}

Note that you shouldn't Wait until you have resumed, because until you call ResumeFlush() there could be some operations still in the send-buffer. With that all in place, I get (for 100,000 operations):

local: 1766ms (eager-flush) vs 1554ms (suspend-flush)
remote: 1219ms (eager-flush) vs 796ms (suspend-flush)

As you can see, it helps more with remote servers, as it will be putting fewer packets through the network.

I cannot use transactions because later on the to-be-published items are not all available at once. Is there a way to optimize with that knowledge in mind?

I think that is addressed by the above - but note that recently CreateBatch was added too. A batch operates a lot like a transaction - just: without the transaction. Again, it is another mechanism to reduce packet fragmentation. In your particular case, I suspect the suspend/resume (on flush) is your best bet.

Do you recommend having one general RedisConnection and one RedisSubscriberConnection or any other configuration to have such wrapper perform desired functions?

As long as you're not performing blocking operations (blpop, brpop, brpoplpush etc), or putting oversized BLOBs down the wire (potentially delaying other operations while it clears), then a single connection of each type usually works pretty well. But YMMV depending on your exact usage requirements.