I need to build a server farm that can handle 5+ million connections and 5+ million topics (one per client), and process 300k messages/sec.
To see what various message brokers are capable of, I am currently using two RHEL EC2 instances (r3.4xlarge) so that plenty of resources are available. To save you looking it up: each has 16 vCPUs and 122 GB RAM. I am nowhere near those limits in usage.
I am unable to get past roughly 600k connections. Since there doesn't seem to be any O/S limitation (plenty of RAM, CPU, etc.) on either the client or the server, what is limiting me?
I have edited /etc/security/limits.conf as follows:
* soft nofile 20000000
* hard nofile 20000000
* soft nproc 20000000
* hard nproc 20000000
root soft nofile 20000000
root hard nofile 20000000
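Since limits.conf is only applied at login (through PAM), a process started some other way may not have picked up the new values. A minimal Python sketch for checking what a process actually sees, assuming a Unix system where the standard `resource` module is available (the helper names `nofile_limits` and `format_limit` are mine, for illustration only):

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def format_limit(value):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    # Each open socket consumes one file descriptor, so the soft limit
    # caps how many connections this process can hold.
    print("nofile soft:", format_limit(soft))
    print("nofile hard:", format_limit(hard))
```

Running this from the same shell or service environment that starts the broker or the test client shows whether the 20000000 setting actually reached that process.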
I have edited /etc/sysctl.conf as follows:
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 5242880 5242880 5242880
net.ipv4.tcp_tw_recycle = 1
fs.file-max = 20000000
fs.nr_open = 20000000
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_max_syn_backlog = 10000
net.ipv4.tcp_synack_retries = 3
net.core.somaxconn=65536
net.core.netdev_max_backlog=100000
net.core.optmem_max = 20480000
For Apollo: export APOLLO_ULIMIT=20000000
For ActiveMQ:
ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS -Dorg.apache.activemq.UseDedicatedTaskRunner=false"
ACTIVEMQ_OPTS_MEMORY="-Xms50G -Xmx115G"
I created 20 additional private addresses for eth0 on the client, then assigned them: ip addr add 11.22.33.44/24 dev eth0
I am FULLY aware of the 65k port limit per source IP address, which is why I did the above.
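For the extra addresses to help, the client has to bind each socket to one of them before connecting; otherwise the kernel picks the source address itself. A minimal Python sketch of that step (the function name `connect_from` and its parameters are hypothetical, not taken from my actual test client):

```python
import socket


def connect_from(source_ip, dest_host, dest_port, timeout=5.0):
    """Open a TCP connection whose source address is source_ip.

    Binding to port 0 lets the kernel choose a free ephemeral port
    on that specific address.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        sock.bind((source_ip, 0))          # pin the source address
        sock.connect((dest_host, dest_port))
    except OSError:
        sock.close()
        raise
    return sock
```

A load generator would cycle through the assigned addresses, opening up to one port range's worth of connections from each.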
- For ActiveMQ I got to: 574309
- For Apollo I got to: 592891
- For Rabbit I got to 90k, but the logging was awful and I couldn't figure out what to do to go higher, although I know it's possible.
- For Hive I got to the trial limit of 1000. Awaiting a license.
- IBM wants to trade the cost of my house to use them - nah!
ANSWER: While doing this I realized that I had misspelled the net.ipv4.ip_local_port_range setting in the client's /etc/sysctl.conf file.
I am now able to connect 956,591 MQTT clients to my Apollo server in 188 seconds.
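A misspelled key like this is easy to miss. One way to catch it is to check that every key in sysctl.conf maps to an existing entry under /proc/sys. A rough Python sketch, assuming the usual dots-to-slashes mapping (it does not handle keys whose components themselves contain dots, such as some interface names; the function names are hypothetical):

```python
import os


def parse_sysctl(text):
    """Parse sysctl.conf-style text into a {key: value} dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ('#' or ';').
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def unknown_keys(settings, proc_root="/proc/sys"):
    """Return keys with no matching entry under proc_root."""
    missing = []
    for key in settings:
        path = os.path.join(proc_root, *key.split("."))
        if not os.path.exists(path):
            missing.append(key)
    return missing


if __name__ == "__main__":
    with open("/etc/sysctl.conf") as handle:
        for key in unknown_keys(parse_sysctl(handle.read())):
            print("no such sysctl:", key)
```

Any key it prints is one the kernel does not know, which is what a typo in the name looks like.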
More info: To isolate whether this is an O/S connection limitation or a broker limitation, I decided to write a simple client/server pair.
The server:
The Client:
With 21 IPs, I would expect (65535-1024)*21 = 1354731 to be the boundary. In reality I am able to achieve 1231734.
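The arithmetic behind that boundary, as a small Python sketch (the function name is hypothetical; it follows my own formula of high minus low, so it does not count the range as inclusive of both ends):

```python
def max_client_connections(port_low, port_high, num_ips):
    """Upper bound on outgoing connections to one server ip:port,
    given an ephemeral port range and a number of source addresses."""
    ports_per_ip = port_high - port_low
    return ports_per_ip * num_ips


if __name__ == "__main__":
    expected = max_client_connections(1024, 65535, 21)
    achieved = 1231734
    print("expected boundary:", expected)
    print("fraction achieved: {:.3f}".format(achieved / expected))
```

With the values above, the achieved count works out to roughly 91% of the theoretical boundary.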
So the socket/kernel/io stuff is worked out.
I am STILL unable to achieve this using any broker.
Again, these are the kernel settings just after my client/server test.
Client:
Server: