I am programming a server and it seems like my number of connections is being limited since my bandwidth isn't being saturated even when I've set the number of connections to "unlimited".
How can I increase or eliminate a maximum number of connections that my Ubuntu Linux box can open at a time? Does the OS limit this, or is it the router or the ISP? Or is it something else?
There are a couple of variables that set the max number of connections. Most likely, you're running out of file descriptors first. Check ulimit -n. After that, there are settings in /proc, but those default to the tens of thousands.
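For instance, to check the per-process descriptor limit and, within the hard limit, raise it for the current shell (4096 is just an example value):

    ulimit -n                     # current soft limit on open file descriptors
    ulimit -Hn                    # hard limit the soft limit can be raised to
    ulimit -n 4096                # raise the soft limit for this shell and its children
    cat /proc/sys/fs/file-max     # system-wide cap on open files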
More importantly, it sounds like you're doing something wrong. A single TCP connection ought to be able to use all of the bandwidth between two parties; if it isn't:

- Check for packet loss by pinging with large packets (ping -s 1472 ...)
- Check for rate limiting; on Linux this is configured with tc
- Confirm that the bandwidth you think exists actually exists, e.g. using iperf
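A rough idea of what those checks look like (the host name and interface are placeholders to replace with your own):

    # look for loss along the path using large, non-fragmenting pings
    ping -M do -s 1472 example.host
    # see whether any traffic-shaping / rate-limiting qdiscs are configured
    tc qdisc show dev eth0
    # measure actual throughput (run "iperf -s" on the other machine first)
    iperf -c example.host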
Possibly I have misunderstood. Maybe you're doing something like BitTorrent, where you need lots of connections. If so, you need to figure out how many connections you're actually using (try netstat or lsof). If that number is substantial, you might:

- actually need to raise ulimit -n. Still, ~1000 connections (the default on my system) is quite a few.
- have something else slowing you down, such as an I/O bottleneck. Have you checked iostat -x?

Also, if you are using a consumer-grade NAT router (Linksys, Netgear, DLink, etc.), beware that you may exceed its abilities with thousands of connections.
I hope this provides some help. You're really asking a networking question.
The maximum number of connections is impacted by certain limits on both the client and server sides, albeit a little differently.
On the client side: Increase the ephemeral port range, and decrease tcp_fin_timeout.

To find out the default values:
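For example, using sysctl:

    sysctl net.ipv4.ip_local_port_range
    sysctl net.ipv4.tcp_fin_timeout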
The ephemeral port range defines the maximum number of outbound sockets a host can create from a particular IP address. fin_timeout defines the minimum time these sockets will stay in the TIME_WAIT state (unusable after being used once). Usual system defaults are:

    net.ipv4.ip_local_port_range = 32768 61000
    net.ipv4.tcp_fin_timeout = 60
This basically means your system cannot consistently guarantee more than (61000 - 32768) / 60 = 470 sockets per second. If you are not happy with that, you could begin by increasing the port range. Setting the range to 15000 61000 is pretty common these days. You could further increase availability by decreasing fin_timeout. If you do both, you should more readily see over 1500 outbound connections per second.

To change the values:
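A sketch using sysctl (needs root; 30 for fin_timeout is just an illustrative value consistent with the ~1500/s figure above, since (61000 - 15000) / 30 = 1533):

    sysctl -w net.ipv4.ip_local_port_range="15000 61000"
    sysctl -w net.ipv4.tcp_fin_timeout=30
    # to persist across reboots, put the equivalent key = value lines in /etc/sysctl.conf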
The above should not be interpreted as the factors limiting the system's raw ability to make outbound connections per second; rather, these factors affect the system's ability to handle concurrent connections in a sustainable manner over long periods of activity.
Default sysctl values on a typical Linux box leave both tcp_tw_recycle and tcp_tw_reuse disabled. That does not allow a connection from a "used" socket (one in the TIME_WAIT state) and forces sockets to last the complete TIME_WAIT cycle. Enabling them (see the settings sketched below) allows fast cycling of sockets in the TIME_WAIT state and re-using them. But before you make this change, make sure it does not conflict with the protocols used by the application that needs these sockets. Make sure to read the post "Coping with the TCP TIME-WAIT" by Vincent Bernat to understand the implications. The net.ipv4.tcp_tw_recycle option is quite problematic for public-facing servers, as it cannot handle connections from two different computers behind the same NAT device, which is a problem that is hard to detect and waiting to bite you. Note that net.ipv4.tcp_tw_recycle has been removed as of Linux 4.12.
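As a sketch, the relevant sysctl settings look something like this (typical defaults shown first; keep the caveats above in mind, and note that tcp_tw_recycle no longer exists on kernels 4.12 and later):

    # typical defaults: both disabled
    net.ipv4.tcp_tw_recycle = 0
    net.ipv4.tcp_tw_reuse = 0

    # enable re-use of TIME_WAIT sockets for new outbound connections
    sysctl -w net.ipv4.tcp_tw_reuse=1
    # enabling tcp_tw_recycle as well (older kernels only) runs into the NAT problem described above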
On the server side: The net.core.somaxconn value has an important role. It limits the maximum number of requests queued to a listen socket. If you are sure of your server application's capability, bump it up from the default of 128 to something in the range of 128 to 1024. Now you can take advantage of this increase by setting the listen backlog variable in your application's listen call to an equal or higher integer.

The txqueuelen parameter of your Ethernet cards also has a role to play. The default value is 1000, so bump it up to 5000 or even more if your system can handle it.

Similarly, bump up the values for net.core.netdev_max_backlog and net.ipv4.tcp_max_syn_backlog. Their default values are 1000 and 1024 respectively.

Now remember to start both your client-side and server-side applications with increased FD ulimits in the shell.
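For reference, a sketch of those server-side tweaks as shell commands (run as root; eth0 and the exact numbers are placeholders to adjust for your system):

    sysctl -w net.core.somaxconn=1024
    sysctl -w net.core.netdev_max_backlog=5000
    sysctl -w net.ipv4.tcp_max_syn_backlog=4096
    ip link set dev eth0 txqueuelen 5000
    ulimit -n 65535    # in the shell that launches the server (and client) processes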
Besides the above, one more popular technique used by programmers is to reduce the number of TCP write calls. My own preference is to use a buffer wherein I push the data I wish to send to the client, and then at appropriate points I write the buffered data out to the actual socket. This technique lets me use large data packets, reduces fragmentation, and reduces my CPU utilization both in user land and at the kernel level.
To improve upon the answer given by derobert: you can determine what your OS connection limit is by catting nf_conntrack_max.
For example: cat /proc/sys/net/netfilter/nf_conntrack_max
You can use the following script to count the number of TCP connections on a given range of TCP ports, 1-65535 by default. This will confirm whether or not you are maxing out your OS connection limit.
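Here's a sketch of such a script, assuming ss from iproute2 is available (it counts established connections whose local port falls in the given range):

    #!/bin/bash
    # count_conns.sh: count established TCP connections whose local port
    # is in the range [low, high]. Usage: ./count_conns.sh [low] [high]
    low=${1:-1}
    high=${2:-65535}

    ss -tan | awk -v low="$low" -v high="$high" '
        $1 == "ESTAB" {
            n = split($4, a, ":")    # local address is column 4; port is the last field
            port = a[n] + 0
            if (port >= low && port <= high) count++
        }
        END { print count + 0 }
    '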
At the application level, here are some things a developer can do:

On the server side:
- Check whether your load balancer (if you have one) is working correctly.
- Turn slow TCP timeouts into fast, immediate 503 responses. If your load balancer is working correctly, it should pick a working resource to serve, which is better than hanging there with unexpected error messages. E.g., if you are using a Node server, you can use toobusy from npm to detect when the event loop is overloaded and return a 503 right away. Why 503? Here are some good insights on overload: http://ferd.ca/queues-don-t-fix-overload.html
We can do some work on the client side too:

- Try to group calls in batches, reducing the traffic and the total number of requests between client and server.
- Try to build a caching mid-layer to handle unnecessary duplicate requests.