I have multiple socket.io servers scaled horizontally using a RedisStore. I've got rooms set up effectively and am successfully able to broadcast to rooms across servers, etc. Now I'm trying to build a status page, and what I'm failing to figure out is how to simply count the number of users connected across all servers.
io.sockets.clients('room') and io.sockets.sockets will only tell you the number of connected clients on that one server, not all servers connected to the same RedisStore.
Suggestions?
Thanks.
Here is how I solved it using Redis scripting. It requires Redis 2.6 or later, so for now it most likely still means compiling your own instance.
Each time a process starts up, I generate a new UUID and leave it in the global scope. I could use the pid, but this feels a little safer.
When a user connects (the socket.io connection event), I then push that user's id into a list of users based on that processId. I also set the expiry of that key to 30 seconds.
When a user disconnects (the disconnect event), I remove their id from that list and update the expiry.
I also set up a function that runs on a 30-second interval to essentially "ping" that key so that it stays there. That way, if the process does accidentally die, all of those user sessions will essentially disappear.
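A sketch of that per-process bookkeeping, assuming a node_redis client is passed in. The key name `connected:<uuid>`, the `socket.userId` property, and the 30-second TTL are my own choices, not the original author's:

```javascript
// Hypothetical sketch: track this process's connected users in a Redis list.
// `redisClient` is assumed to be a node_redis client; `processId` is the
// UUID generated at startup.
function userListKey(processId) {
  return 'connected:' + processId;
}

function trackConnections(io, redisClient, processId) {
  var key = userListKey(processId);
  io.sockets.on('connection', function (socket) {
    // socket.userId is assumed to be set earlier (e.g. by your auth handshake)
    redisClient.rpush(key, socket.userId);
    redisClient.expire(key, 30);
    socket.on('disconnect', function () {
      // remove one occurrence of this user's id and refresh the expiry
      redisClient.lrem(key, 1, socket.userId);
      redisClient.expire(key, 30);
    });
  });
}

function startKeepAlive(redisClient, processId, ttlSec) {
  // "ping" the key on an interval so it survives while this process lives;
  // if the process dies, the key (and its sessions) expires on its own
  return setInterval(function () {
    redisClient.expire(userListKey(processId), ttlSec);
  }, ttlSec * 1000);
}
```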
Now for the magic. Redis 2.6 includes Lua scripting, which essentially gives you stored-procedure sort of functionality. It's really fast and not very processor intensive (they compare it to "almost" running C code).
My stored procedure basically loops through all of the process lists and creates a user:user_id key with each user's total count of current logins. This means that if they're logged in with two browsers, I can still use logic to tell whether they've disconnected completely or only ended one of their sessions.
I run this function every 15 seconds on all my processes, and also after a connect/disconnect event. This means that my user counts will most likely be accurate to the second, and never incorrect for more than 15 to 30 seconds.
The code to generate that redis function looks like:
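The original snippet wasn't captured here. A sketch of what such a script could look like — the `connected:*` and `user:<id>` key names and the 60-second TTL are assumptions of mine, and loading via SCRIPT LOAD through node_redis is likewise assumed:

```javascript
// Hypothetical reconstruction of the counting script: scan every
// per-process list, tally logins per user id, and write each total to a
// user:<id> key with a short TTL so stale counts expire on their own.
var countScript = [
  "local counts = {}",
  "for _, key in ipairs(redis.call('keys', 'connected:*')) do",
  "  for _, id in ipairs(redis.call('lrange', key, 0, -1)) do",
  "    counts[id] = (counts[id] or 0) + 1",
  "  end",
  "end",
  "for id, n in pairs(counts) do",
  "  redis.call('setex', 'user:' .. id, 60, n)",
  "end",
  "return 1"
].join("\n");

// Load it once so it can later be invoked cheaply by its SHA1 digest.
function loadCountScript(redisClient, callback) {
  redisClient.script('load', countScript, callback); // callback(err, sha)
}
```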
And then I can use that in my script later on with:
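That later use also wasn't captured; a sketch under the same assumptions (the EVALSHA call through node_redis and the 15-second timer wiring are mine):

```javascript
// Hypothetical usage: run the loaded script by its SHA, both on a
// 15-second interval and after each connect/disconnect event.
function refreshUserCounts(redisClient, sha, callback) {
  redisClient.evalsha(sha, 0, callback || function () {}); // 0 = no KEYS args
}

function startPeriodicRefresh(redisClient, sha) {
  return setInterval(function () {
    refreshUserCounts(redisClient, sha);
  }, 15000);
}
```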
The last thing I do is handle some shutdown events, to make sure that the process does its best not to rely on the Redis timeouts and actually shuts down gracefully.
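One way that shutdown handling might be sketched — the signal names, the key name, and routing the exit through a swappable `done` callback are all my own choices:

```javascript
// Hypothetical graceful shutdown: delete this process's list immediately
// instead of waiting for the 30-second expiry. `done` defaults to
// process.exit so tests can substitute their own callback.
function cleanUp(redisClient, processId, done) {
  done = done || function () { process.exit(0); };
  redisClient.del('connected:' + processId, function () { done(); });
}

function registerShutdownHandlers(redisClient, processId) {
  ['SIGINT', 'SIGTERM'].forEach(function (sig) {
    process.on(sig, function () { cleanUp(redisClient, processId); });
  });
}
```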
So far it's been working great.
One thing I still want to do is make it so that the redis function returns any keys that have changed their values. This way I could actually send out an event if the counts have changed for a particular user without any of the servers actively knowing (like if a process dies). For now, I have to rely on polling the user:* values again to know that it's changed. It works, but it could be better...
I solved this by having each server periodically set a user count in redis with an expiration that included their own pid:
every <interval> seconds do:
    SETEX userCount:<pid> <interval+10> <count>
then the status server can query for each of these keys and sum their values:
total = 0
for each key in KEYS userCount:*
    total += GET <key>
so if a server crashes or is shut down, its counts will drop out of Redis after interval+10 seconds
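In real code the two halves might look something like this (a node_redis client is assumed; the `userCount:<pid>` key pattern follows the pseudocode):

```javascript
// Hypothetical sketch: each app server periodically publishes its own count,
// and the status server sums whatever keys are currently alive.
function publishCount(redisClient, pid, intervalSec, count) {
  // TTL of interval+10 means a dead server's count drops out on its own.
  redisClient.setex('userCount:' + pid, intervalSec + 10, count);
}

function totalUsers(redisClient, callback) {
  redisClient.keys('userCount:*', function (err, keys) {
    if (err) return callback(err);
    if (keys.length === 0) return callback(null, 0);
    redisClient.mget(keys, function (err, values) {
      if (err) return callback(err);
      var total = values.reduce(function (sum, v) {
        return sum + parseInt(v || '0', 10);
      }, 0);
      callback(null, total);
    });
  });
}
```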
sorry about the ugly pseudocode. :)
When a user connects to the chat room, you can atomically increment a user counter in Redis. When a user disconnects, you decrement the value. This way Redis maintains the user count, and it is accessible to all servers.
See INCR and DECR
When a user connects:
When a user disconnects:
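A minimal sketch of both handlers, assuming a node_redis client; the counter key name `chat:userCount` is my own:

```javascript
// Hypothetical sketch: one shared counter in Redis, adjusted atomically
// with INCR on connect and DECR on disconnect.
function countUsers(io, redisClient) {
  io.sockets.on('connection', function (socket) {
    redisClient.incr('chat:userCount');       // user connected: count + 1
    socket.on('disconnect', function () {
      redisClient.decr('chat:userCount');     // user disconnected: count - 1
    });
  });
}
```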
You could use hash keys to store the values.
When a user connects to server 1, you could set a field called "srv1" on a key called "userCounts". Just overwrite the value with whatever the current count is, using HSET. No need to increment/decrement; just set the current value known by socket.io.
When another user connects to a different server set a different field.
Then any server can get the total by returning all the fields from "userCounts" and adding them together using HVALS to return a value list.
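As a sketch, assuming a node_redis client (the key and field names follow the answer):

```javascript
// Hypothetical sketch: each server overwrites its own field with HSET,
// and any server totals all fields with HVALS.
function reportCount(redisClient, serverField, count) {
  redisClient.hset('userCounts', serverField, count); // e.g. field "srv1"
}

function totalCount(redisClient, callback) {
  redisClient.hvals('userCounts', function (err, values) {
    if (err) return callback(err);
    callback(null, values.reduce(function (sum, v) {
      return sum + parseInt(v, 10);
    }, 0));
  });
}
```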
When a server crashes, you'll need to run a script in response that removes that server's field from "userCounts", or HSETs it to "0".
You can look at Forever to automate restarting the server.