Counting socket.io users across horizontal servers

Published 2019-03-16 03:39

I have multiple socket.io servers scaled horizontally using a RedisStore. I've got rooms set up effectively and can successfully broadcast to rooms across servers, etc. Now I'm trying to build a status page, and what I can't figure out is how to simply count the number of users connected across all servers.

io.sockets.clients('room') and io.sockets.sockets will only tell you the number of connected clients on that one server, not all servers connected to the same RedisStore.

Suggestions?

Thanks.

4 Answers
Anthone
2019-03-16 03:54

Here is how I solved it using Redis scripting. It requires Redis 2.6 or later, so at the time of writing it most likely still means compiling your own instance.

Each time a process starts up, I generate a new UUID and leave it in the global scope. I could use the pid, but this feels a little safer.

# Pardon my coffeescript
processId = require('node-uuid').v4()

When a user connects (the socket.io connection event), I then push that user's id into a list of users based on that processId. I also set the expiry of that key to 30 seconds.

RedisClient.lpush "process:#{processId}", user._id
RedisClient.expire "process:#{processId}", 30

When a user disconnects (the disconnect event), I remove it and update the expiry.

RedisClient.lrem "process:#{processId}", 1, user._id
RedisClient.expire "process:#{processId}", 30

I also set up a function that runs on a 30-second interval to essentially "ping" that key so that it stays alive. That way, if the process does accidentally die, all of those user sessions will essentially disappear.

setInterval ->
  RedisClient.expire "process:#{processId}", 30
, 30 * 1000

Now for the magic. Redis 2.6 includes Lua scripting, which essentially gives you stored-procedure-style functionality. It's really fast and not very processor intensive (the docs compare it to "almost" running C code).

My stored procedure basically loops through all of the process lists, and creates a user:user_id key with their total count of current logins. This means that if they're logged in with two browsers, etc. it'll still allow me to use logic to tell if they've disconnected completely, or just one of their sessions.

I run this function every 15 seconds on all my processes, and also after a connect/disconnect event. This means that my user counts will most likely be accurate to the second, and never incorrect for more than 15 to 30 seconds.

The code to generate that redis function looks like:

def = require("promised-io/promise").Deferred

reconcileSha = ->
  reconcileFunction = "
    local keys_to_remove = redis.call('KEYS', 'user:*')
    for i=1, #keys_to_remove do
      redis.call('DEL', keys_to_remove[i])
    end

    local processes = redis.call('KEYS', 'process:*')
    for i=1, #processes do
      local users_in_process = redis.call('LRANGE', processes[i], 0, -1)
      for j=1, #users_in_process do
        redis.call('INCR', 'user:' .. users_in_process[j])
      end
    end
  "

  dfd = new def()
  RedisClient.script 'load', reconcileFunction, (err, res) ->
    dfd.resolve(res)
  dfd.promise

And then I can use that in my script later on with:

reconcileSha().then (sha) ->
  RedisClient.evalsha sha, 0, (err, res) ->
    # do stuff
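The counting step of that Lua script can be sketched as a plain function over an in-memory model of the process lists (the shapes and names below are illustrative, not from the answer):

```typescript
// Hypothetical in-memory model: keys are "process:<uuid>", values are the
// user ids currently connected on that process, mirroring the Redis lists.
type ProcessLists = Record<string, string[]>;

// Mirrors the Lua script: start from empty counts, then re-count every
// login across every process list.
function reconcile(processLists: ProcessLists): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const users of Object.values(processLists)) {
    for (const userId of users) {
      counts[userId] = (counts[userId] ?? 0) + 1;
    }
  }
  return counts;
}

// A user logged in on two processes still yields a single key with count 2,
// which is what lets you distinguish "one session closed" from "fully offline".
console.log(reconcile({
  "process:a": ["u1", "u2"],
  "process:b": ["u1"],
})); // { u1: 2, u2: 1 }
```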

The last thing I do is try to handle some shutdown events, to make sure the process tries its best not to rely on the Redis timeouts and actually shuts down gracefully.

gracefulShutdown = (callback) ->
  console.log "shutdown"
  reconcileSha().then (sha) ->
    RedisClient.del("process:#{processId}")
    RedisClient.evalsha sha, 0, (err, res) ->
      callback() if callback?

# For ctrl-c
process.once 'SIGINT', ->
  gracefulShutdown ->
    process.kill(process.pid, 'SIGINT')

# For nodemon
process.once 'SIGUSR2', ->
  gracefulShutdown ->
    process.kill(process.pid, 'SIGUSR2')

So far it's been working great.

One thing I still want to do is make it so that the redis function returns any keys that have changed their values. This way I could actually send out an event if the counts have changed for a particular user without any of the servers actively knowing (like if a process dies). For now, I have to rely on polling the user:* values again to know that it's changed. It works, but it could be better...
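In the meantime, one way to approximate that purely client-side is to diff two successive snapshots of the `user:*` counts and emit events only for the ids whose totals changed; a minimal sketch (names are illustrative, not from the answer):

```typescript
// A snapshot of the user:* keys: user id -> current login count.
type Counts = Record<string, number>;

// Compare two snapshots and return the ids whose counts differ; a missing
// key is treated as a count of 0 (user fully offline).
function changedUsers(before: Counts, after: Counts): string[] {
  const ids = new Set([...Object.keys(before), ...Object.keys(after)]);
  return Array.from(ids).filter(
    (id) => (before[id] ?? 0) !== (after[id] ?? 0)
  );
}

// u1 dropped from 2 sessions to 1; u2 is unchanged.
console.log(changedUsers({ u1: 2, u2: 1 }, { u1: 1, u2: 1 })); // ["u1"]
```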

倾城 Initia
2019-03-16 04:03

I solved this by having each server periodically set its user count in Redis, under a key that includes its own pid, with an expiration:

every <interval> seconds: SETEX userCount:<pid> <interval+10> <count>

then the status server can query for all of these keys and sum their values:

for each key matching userCount:* do total += GET <key>

so if a server crashes or is shut down, its counts will drop out of Redis after interval+10 seconds

sorry about the ugly pseudocode. :)
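The summing step can be sketched like this, assuming the values of the `userCount:<pid>` keys have already been read from Redis (the key names below are made up for illustration):

```typescript
// Each entry stands for one userCount:<pid> key and its value as read from
// Redis; Redis returns values as strings, hence the parseInt.
function totalUsers(countsByPid: Record<string, string>): number {
  return Object.values(countsByPid).reduce(
    (total, count) => total + parseInt(count, 10),
    0
  );
}

// Two servers reporting 5 and 12 connected users respectively.
console.log(totalUsers({ "userCount:101": "5", "userCount:202": "12" })); // 17
```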

等我变得足够好
2019-03-16 04:07

When a user connects to the chat room, you can atomically increment a user counter in your RedisStore. When a user disconnects, you decrement the value. This way Redis maintains the user count, and it is accessible to all servers.

See INCR and DECR

SET userCount = "0"

When a user connects:

INCR userCount

When a user disconnects:

DECR userCount
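A minimal in-memory stand-in for this flow (with real Redis, INCR and DECR are atomic, so concurrent servers can safely bump the same key):

```typescript
// Shared counter, standing in for the Redis key "userCount".
let userCount = 0;

const onConnect = () => { userCount += 1; };    // INCR userCount
const onDisconnect = () => { userCount -= 1; }; // DECR userCount

// Two users connect, one disconnects.
onConnect();
onConnect();
onDisconnect();
console.log(userCount); // 1
```

Note that if a server dies without firing its disconnect handlers, the decrements never happen, which is why the other answers lean on expirations or per-server keys.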
姐就是有狂的资本
2019-03-16 04:07

You could use hash keys to store the values.

When a user connects to server 1, you could set a field called "srv1" on a key called "userCounts". Just overwrite the value with whatever the current count is, using HSET. No need to increment/decrement; just set the current value known by socket.io.

HSET userCounts srv1 "5"

When another user connects to a different server set a different field.

HSET userCounts srv2 "10"

Then any server can get the total by fetching all of the fields from "userCounts" with HVALS, which returns the list of values, and adding them together.

HVALS userCounts

When a server crashes you'll need to run a script in response to the crash that removes that server's field from userCounts or HSET it to "0".
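The whole hash flow, modeled in memory (the srv1/srv2 names follow the examples above; with real Redis, HDEL is the command that would remove a crashed server's field):

```typescript
// Model of the userCounts hash: fields are server names, values are the
// counts each server last reported via HSET (Redis stores them as strings).
const userCounts: Record<string, string> = { srv1: "5", srv2: "10" };

// HVALS userCounts, then add the values up.
const total = Object.values(userCounts).reduce(
  (sum, v) => sum + parseInt(v, 10),
  0
);
console.log(total); // 15

// If srv2 crashes, drop its field (HDEL userCounts srv2) so its stale
// count stops contributing to the total.
delete userCounts.srv2;
```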

You can look at Forever to automate restarting the server.
