On Ubuntu, the maximum number of sockets that can be opened seems to be governed by the following:
$ cat /proc/sys/net/ipv4/tcp_max_orphans
262144
According to one of the presentations by Rick Reed (from WhatsApp), they took it up to 2 million concurrent connections on a "single server" using FreeBSD and Erlang. My understanding is that we will always need some support from the kernel, and indeed it looks like they tweaked FreeBSD to gain this capability:
hw.machine: amd64
hw.model: Intel(R) Xeon(R) CPU X5675 @ 3.07GHz
hw.ncpu: 24
hw.physmem: 103062118400
hw.usermem: 100556451840
jkb@c123$ uname -rps
FreeBSD 8.2-STABLE amd64
jkb@c123$ cat /boot/loader.conf.local
kern.ipc.maxsockets=2400000
kern.maxfiles=3000000
kern.maxfilesperproc=2700000
So it looks like the kernel can be tweaked to support that many physical connections, given a sufficient amount of memory, correct? If so, then it seems pretty simple, so what is all the hype about? Or am I missing something?
Thanks.
Note that there are three things here:
1. Getting the server to support two million connections. This is usually just a matter of tweaking the kernel so that that many simultaneous connections are allowed and so that the context associated with each connection fits in (wired) main memory. The latter means you can't have a megabyte of buffer space allocated for each connection.
2. Doing something with each connection. You map the connections into user space in a process (Erlang in this case). Now, if each connection allocates too much data at the user-space level, we are back to square one: we can't do it.
3. Getting multiple cores to do something with the connections. This is necessary due to the sheer amount of work to be done. It is also the point where you want to avoid too much locking, and so on.
You seem to be focused on point 1 alone (points 2 and 3 are sketched in the example below).
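To make points 2 and 3 concrete, here is a minimal sketch of an event-driven server in Java NIO: one non-blocking selector loop per core and almost no per-connection state in user space. This is only an illustration, not WhatsApp's code (they used Erlang); the class name, port 9000, backlog and buffer sizes are assumptions.
// Minimal illustration of points 2 and 3: one non-blocking event loop per core,
// with tiny per-connection state so millions of connections can fit in RAM.
// All names, the port and the sizes below are illustrative assumptions.
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class ManyConnectionsSketch {
    public static void main(String[] args) throws IOException {
        int cores = Runtime.getRuntime().availableProcessors();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.setOption(StandardSocketOptions.SO_REUSEADDR, true);
        server.bind(new InetSocketAddress(9000), 65535); // large accept backlog
        server.configureBlocking(false);

        // Point 3: spread the work over one event loop per core instead of
        // one thread (and one stack, and lots of locking) per connection.
        for (int i = 0; i < cores; i++) {
            Selector selector = Selector.open();
            // Every loop watches the listening socket; whichever loop accepts a
            // connection also services it afterwards.
            server.register(selector, SelectionKey.OP_ACCEPT);
            new Thread(() -> eventLoop(selector), "loop-" + i).start();
        }
    }

    static void eventLoop(Selector selector) {
        // Point 2: one small scratch buffer per loop, shared by all of its
        // connections, instead of a big user-space buffer per connection.
        ByteBuffer buf = ByteBuffer.allocateDirect(4096);
        while (true) {
            try {
                selector.select();
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isAcceptable()) {
                        SocketChannel c = ((ServerSocketChannel) key.channel()).accept();
                        if (c != null) { // null if another loop accepted it first
                            c.configureBlocking(false);
                            c.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel c = (SocketChannel) key.channel();
                        buf.clear();
                        int n = c.read(buf);
                        if (n < 0) { key.cancel(); c.close(); continue; }
                        buf.flip();
                        c.write(buf); // echo back; partial writes ignored in this sketch
                    }
                }
                selector.selectedKeys().clear();
            } catch (IOException e) {
                // A real server needs proper per-connection error handling here.
            }
        }
    }
}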
If you have enough RAM, it's not too hard to handle 1M or more connections on Linux. These guys handled 10 million connections with a Java application on a single box, using a regular CentOS kernel with a few sysctl tweaks:
sysctl -w fs.file-max=12000500
sysctl -w fs.nr_open=20000500
ulimit -n 20000000
sysctl -w net.ipv4.tcp_mem='10000000 10000000 10000000'
sysctl -w net.ipv4.tcp_rmem='1024 4096 16384'
sysctl -w net.ipv4.tcp_wmem='1024 4096 16384'
sysctl -w net.core.rmem_max=16384
sysctl -w net.core.wmem_max=16384
They also balanced the network adapter's interrupts across cores via /proc/irq/ and added a tweak for better JVM operation with huge pages:
sysctl -w vm.nr_hugepages=30720
With two 6-core CPUs at 57% load, they served 1Gbps over 12M connections in 2013.
But you need a HUGE amount of RAM for that. The above test was run on a server with 96GB of RAM, 36GB of which was used by the kernel for the buffers of the 12M sockets (roughly 3KB per socket).
To serve 1M connections with similar settings you'll need a server with at least 8GB of RAM, 3-4GB of which would be used just for socket buffers.
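As a hedged, application-side illustration of the same budget (not part of the test above): a Java NIO server can also ask the kernel for small per-connection buffers explicitly. The helper name and the 4KB value are assumptions; the kernel may round the values up, and setting SO_RCVBUF by hand disables receive-buffer auto-tuning.
import java.io.IOException;
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;

// Illustrative helper (an assumption, not from the benchmark above): request
// small kernel buffers for each accepted connection so that N connections times
// (receive buffer + send buffer) stays inside the RAM budget discussed above.
class SmallSocketBuffers {
    static void shrink(SocketChannel channel) throws IOException {
        channel.setOption(StandardSocketOptions.SO_RCVBUF, 4096);
        channel.setOption(StandardSocketOptions.SO_SNDBUF, 4096);
    }
}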