When combining socket.io/node.js and redis pub/sub in an attempt to create a real-time web broadcast system driven by server events that can handle multiple transports, there seem to be three approaches:
1. 'createClient' a redis connection and subscribe to channel(s). On socket.io client connection, join the client into a socket.io room. In the redis.on("message", ...) event, call io.sockets.in(room).emit("event", data) to distribute to all clients in the relevant room. As in "How to reuse redis connection in socket.io?"
2. 'createClient' a redis connection. On socket.io client connection, join the client into a socket.io room and subscribe to the relevant redis channel(s). Include redis.on("message", ...) inside the client connection closure and, on receipt of a message, call client.emit("event", data) to raise the event on that specific client. As in the answer to "Examples in using RedisStore in socket.io".
3. Use the RedisStore baked into socket.io and 'broadcast' from the single "dispatch" channel in Redis following the socketio-spec protocol.
Number 1 allows handling the Redis sub and associated event once for all clients. Number 2 offers a more direct hook into Redis pub/sub. Number 3 is simpler, but offers little control over the messaging events.
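To make the difference between approaches 1 and 2 concrete, here is a minimal sketch of how each might be wired up. It assumes a node_redis-style subscriber (`subscribe`, `on("message", (channel, message) => ...)`) and the socket.io API mentioned above (`io.sockets.in(room).emit`, `client.join`). The function names, the `channel`/`room` parameters, and the listener cleanup on disconnect are my own illustration, not something taken from the linked answers.

```javascript
// Approach 1: one shared subscriber, one "message" handler, fan-out via a room.
// `sub` is assumed to be a node_redis-style client used only for subscribing;
// `io` is assumed to be a socket.io server.
function wireSharedSubscriber(sub, io, channel, room) {
  sub.subscribe(channel);
  sub.on("message", (ch, message) => {
    if (ch !== channel) return; // ignore other channels on this connection
    io.sockets.in(room).emit("event", message);
  });
  io.sockets.on("connection", (client) => client.join(room));
}

// Approach 2: the "message" handler lives inside each client's connection
// closure, so it runs once per connected client for every published message.
function wirePerClientListener(sub, io, channel, room) {
  io.sockets.on("connection", (client) => {
    client.join(room);
    sub.subscribe(channel); // re-subscribing to the same channel is harmless
    const onMessage = (ch, message) => {
      if (ch === channel) client.emit("event", message);
    };
    sub.on("message", onMessage);
    // Assumption: drop the listener on disconnect so handlers don't pile up.
    client.on("disconnect", () => sub.removeListener("message", onMessage));
  });
}
```

With real libraries you would pass in something like `redis.createClient()` and the object returned by socket.io's `listen`; taking them as parameters just keeps the sketch independent of any particular version.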
However, in my tests, all exhibit unexpectedly low performance with more than 1 connected client. The server events in question are 1,000 messages published to a redis channel as quickly as possible, to be distributed as quickly as possible. Performance is measured by timings at the connected clients (socket.io-client based that log timestamps into a Redis list for analysis).
What I surmise is that in option 1, the server receives the message, then sequentially writes it to all connected clients. In option 2, the server receives each message multiple times (once per client subscription) and writes it to the relevant client. In either case, the server doesn't get to the second message event until the first has been communicated to all connected clients, a situation clearly exacerbated by rising concurrency.
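The surmise above can be illustrated with a toy model that uses only Node's built-in EventEmitter as a stand-in for the Redis subscriber. It only models the JavaScript-level handler work (no real sockets or Redis are involved), and `simulate` and its counters are hypothetical names for this sketch.

```javascript
const { EventEmitter } = require("events");

// Count handler invocations and per-client writes for approach 1 or 2.
function simulate(approach, messageCount, clientCount) {
  const bus = new EventEmitter(); // stand-in for the Redis subscriber
  bus.setMaxListeners(0);         // allow one listener per client in approach 2
  const stats = { handlerCalls: 0, writes: 0, order: [] };
  const write = (clientId, msg) => {
    stats.writes++;
    stats.order.push(`${msg}->c${clientId}`);
  };

  if (approach === 1) {
    // One handler per message; it loops over every client.
    bus.on("message", (msg) => {
      stats.handlerCalls++;
      for (let c = 0; c < clientCount; c++) write(c, msg);
    });
  } else {
    // One handler per client; each fires for every message.
    for (let c = 0; c < clientCount; c++) {
      bus.on("message", (msg) => {
        stats.handlerCalls++;
        write(c, msg);
      });
    }
  }

  // emit() runs listeners synchronously, so message m1 is not touched
  // until every write for m0 has been issued.
  for (let m = 0; m < messageCount; m++) bus.emit("message", `m${m}`);
  return stats;
}
```

In this model, both approaches issue messages × clients writes in the same order; approach 2 additionally pays for messages × clients handler invocations rather than one per message. So 1,000 messages to 50 clients means 50,000 writes either way, all queued on a single thread.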
This seems at odds with the received wisdom about the stack's capabilities. I want to believe, but I'm struggling.
Is this scenario (low latency distribution of high volume of messages) just not an option with these tools (yet?), or am I missing a trick?
I thought this was a reasonable question and had researched it briefly a while back. I spent a little time searching for examples that you may be able to pick up some helpful tips from.
Examples
I like to begin with straightforward examples:
The light sample is a single page (note that you'll want to replace redis-node-client with something like node_redis by Matt Ranney):
Documents
There's a ton of documentation out there, and the APIs on this type of stack are changing rapidly, so you'll have to weigh how current each doc is.
Related Questions
Just a few related questions; this is a hot topic on Stack Overflow:
Notable tips (ymmv)
Turn off or optimize socket pooling, use efficient bindings, monitor latency, and make sure you're not duplicating work (i.e., there's no need to publish to all listeners twice).
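For the "monitor latency" tip, one possible approach is to stamp each message at publish time and compute the delta on receipt, then summarize the collected values. The sketch below is only that: `stamp`, `latencyMs`, and `summarize` are made-up helper names, the JSON envelope with `sentAt` is an assumed message format, and the percentiles use the simple nearest-rank method.

```javascript
// Publisher side: wrap the payload with its send time before PUBLISHing it.
function stamp(payload, now = Date.now()) {
  return JSON.stringify({ sentAt: now, payload });
}

// Receiver side: latency in milliseconds for one raw message string.
function latencyMs(raw, now = Date.now()) {
  return now - JSON.parse(raw).sentAt;
}

// Summarize a list of latencies; returns null for an empty list.
function summarize(latencies) {
  if (latencies.length === 0) return null;
  const sorted = [...latencies].sort((a, b) => a - b);
  // Nearest-rank percentile: smallest value with at least p of the data at or below it.
  const rank = (p) => sorted[Math.max(0, Math.ceil(p * sorted.length) - 1)];
  const mean = sorted.reduce((sum, x) => sum + x, 0) / sorted.length;
  return {
    count: sorted.length,
    min: sorted[0],
    p50: rank(0.5),
    p95: rank(0.95),
    max: sorted[sorted.length - 1],
    mean,
  };
}
```

Keep in mind the numbers are only meaningful if publisher and receivers share a clock (same machine, or well-synchronized hosts). Watching how p95 and max move as you add clients should show whether the fan-out loop is the bottleneck.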