We built an app that lets people fire baseballs over the internet. It lives entirely within Amazon's AWS ecosystem, and we're building on that for a new project. The stack includes:
- Dedicated MongoDB and Redis servers
- Three different groups of Node.js servers
- Amazon's APIs for server configuration and autoscaling
The issue we're facing is that we can't simulate more than about 15,000 concurrent users (websocket connections) per instance. We expect considerably more, on the order of tens of thousands, and server CPU usage sits at only 40%.
Any thoughts on how to scale a Node.js app so that a single server can handle many more simultaneous connections?
Every TCP connection has an open file descriptor in the operating system, so it is important to set the limit to a number above what you need.
For example, on Ubuntu you can see this limit with:
$ ulimit -a
$ ulimit -n
To raise this limit permanently on Ubuntu, edit /etc/security/limits.conf and add these lines with the number you want:
* soft nofile 100000
* hard nofile 100000
Then reboot (logging out and back in is enough, since limits.conf is applied at login):
$ sudo reboot
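To confirm the new limits actually took effect after logging back in, a quick check (assuming a Linux shell):

```shell
# Soft limit: what the process gets by default.
ulimit -Sn
# Hard limit: the ceiling an unprivileged process may raise its soft limit to.
ulimit -Hn
```

If these still print the old values, the limits.conf change isn't being applied to your session (some service managers and SSH configurations bypass pam_limits).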
A websocket is a TCP connection, no? And how long do your customers keep their connections open?
A server will have a limit on the number of open TCP connections you can have, and your operating system will also limit the number of open file handles a process may have at any one time. So:
- what is the TCP open-socket limit on your server, and
- what is the open file-handle limit on your server?
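Both of these can be inspected without root on Linux via the /proc filesystem; a sketch (the `<pid>` placeholder stands for your Node server's process ID):

```shell
# System-wide file-descriptor usage: allocated, free, and the ceiling.
cat /proc/sys/fs/file-nr
# Socket usage by protocol; the TCP "inuse" figure covers open connections.
cat /proc/net/sockstat
# Descriptors held by one process (substitute your server's PID):
# ls /proc/<pid>/fd | wc -l
```

Comparing the per-process descriptor count against `ulimit -n` at the moment the simulation stalls tells you whether you're hitting the file-handle wall or something else.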
I would assume you're starting to hit some of the kernel's default limits on the TCP stack and file descriptors. Have you tried any system-level optimizations yet? If so, which ones?
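If you haven't touched kernel settings yet, these are sysctl knobs commonly raised for high connection counts. The values below are illustrative starting points to experiment with, not tuned recommendations:

```shell
# Additions to /etc/sysctl.conf sometimes used for many concurrent sockets.
# Values are examples, not recommendations -- measure before and after.
fs.file-max = 2097152                      # system-wide open-file ceiling
net.core.somaxconn = 65535                 # max listen() backlog
net.ipv4.ip_local_port_range = 1024 65535  # ephemeral ports (matters for your test clients)
net.ipv4.tcp_max_syn_backlog = 65535       # pending half-open connections
```

Apply with `sudo sysctl -p` and retest; the ephemeral-port range in particular often caps load generators rather than the server itself, since each simulated client consumes a source port.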