Determine the current number of backlogged connect

Posted 2019-04-07 19:20

Question:

Is there a way to find out the current number of connection attempts awaiting accept() on a TCP socket on Linux?

I suppose I could count the number of accept() calls that succeed before hitting EWOULDBLOCK on each pass through the event loop, but I'm using a high-level library (Python/Twisted) that hides these details. It also uses epoll() rather than an old-fashioned select()/poll() loop.
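For reference, the counting idea can be sketched with a plain non-blocking socket, outside of Twisted. This is only an illustration under stated assumptions: `drain_accept_queue` and its `limit` parameter are hypothetical names, not part of Twisted or any library, and the result only reflects the queue at the moment of the call (and can never exceed the listen backlog).

```python
import socket


def drain_accept_queue(listener, limit=1024):
    """Accept pending connections on a non-blocking listener until it would block.

    Returns the list of (connection, address) pairs that were accepted;
    its length is the number of connections that were waiting.
    `limit` is a hypothetical safety cap so one call cannot loop forever
    under a constant stream of new connections.
    """
    accepted = []
    while len(accepted) < limit:
        try:
            accepted.append(listener.accept())
        except BlockingIOError:
            # EWOULDBLOCK/EAGAIN: nothing left to accept right now.
            break
    return accepted
```

The listener has to be put into non-blocking mode first with `listener.setblocking(False)`; otherwise the final accept() blocks instead of raising.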

I am trying to get a general sense of the load on a high-performance non-blocking network server, and I think this number would be a good characterization. Load average/CPU statistics aren't helping much, because I'm doing a lot of disk I/O in concurrent worker processes. Most of these stats on Linux count time spent waiting on disk I/O as part of the load (which it isn't, for my particular server architecture). Latency between accept() and response isn't a good measure either, since each request usually gets processed very quickly once the server gets around to it. I'm just trying to find out how close I am to reaching a breaking point where the server can't dispatch requests faster than they are coming in.

Answer 1:

There is no function for this in the BSD Sockets API that I have ever seen. I question whether it is really a useful measure of load. You are assuming no connection pooling by clients, for one thing, and you are also assuming that latency is entirely manifested as pending connections. But as you can't get the number anyway the point is moot.



Answer 2:

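Assuming SYN cookies aren't enabled (or haven't been triggered due to volume), I think you should be able to get an approximate figure just by examining the output of netstat and seeing how many connections targeting your port are in the SYN_RECV state. Keep in mind that SYN_RECV covers connections that are still in the middle of the handshake, so connections that have completed it but have not yet been accepted are not included in this count.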

Here's a little Python hack that will get that figure for you for a given listening port:

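#!/usr/bin/env python3
"""Count sockets on a local port that are in the SYN_RECV state."""

import sys

# TCP states in /proc/net/tcp are two-digit hex codes; 03 is SYN_RECV.
STATE_SYN_RECV = '03'

def count_state(find_port, find_state):
    """Return how many IPv4 sockets with local port find_port are in find_state."""
    count = 0
    # Only IPv4 sockets are listed here; IPv6 sockets are in /proc/net/tcp6.
    with open('/proc/net/tcp', 'r') as f:
        next(f, None)  # skip the header row
        for line in f:
            entries = line.split()
            # entries[1] is the local address as HEXADDR:HEXPORT
            local_addr, local_port = entries[1].split(':')
            if int(local_port, 16) != find_port:
                continue
            # entries[3] is the socket state
            if entries[3] == find_state:
                count += 1
    return count


if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("Usage: count_syn_recv.py <port>")
        sys.exit(1)

    port = int(sys.argv[1])

    count = count_state(port, STATE_SYN_RECV)
    print("syn_recv_count=%d" % count)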


Answer 3:

You can look at the unacked value in the output of ss. For example, to examine port 80:

ss -lti '( sport = :http )'

The output could look like this:

State  Recv-Q  Send-Q  Local Address:Port  Peer Address:Port   
LISTEN    123      0              :::http               :::*
    rto:0.99 mss:536 cwnd:10 unacked:123

For a proof, kernel sources and all, that unacked is indeed the TCP connection backlog, see the detailed article "Apache TCP Backlog" by Ryan Frantz. Note that you may need a fairly recent version of ss for the unacked field to be included in the output. Mine (iproute2-ss131122), at least, does not provide it.
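If your ss does not print unacked, the same figure can probably be read without it. As far as I know, Linux reports the accept-queue length of a listening socket in the rx_queue column of /proc/net/tcp (the same value ss shows as Recv-Q for LISTEN sockets, 123 in the sample output above). The following is a minimal sketch under that assumption; `accept_queue_length` and `read_accept_queue_length` are hypothetical helper names, and the parsing takes the file contents as a string so it can be pointed at /proc/net/tcp6 as well.

```python
LISTEN = "0A"  # TCP_LISTEN as printed in the "st" column of /proc/net/tcp


def accept_queue_length(proc_text, port):
    """Sum rx_queue over all LISTEN sockets bound to `port`.

    `proc_text` is the full contents of /proc/net/tcp (or tcp6).
    Returns None when no listening socket on that port is found.
    Assumption: for LISTEN sockets, rx_queue holds the number of
    connections waiting to be accepted.
    """
    total = None
    for line in proc_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 5:
            continue
        # fields[1] is local_address as HEXADDR:HEXPORT
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port != port or fields[3] != LISTEN:
            continue
        # fields[4] is tx_queue:rx_queue, both in hex
        rx_queue = int(fields[4].split(":")[1], 16)
        total = rx_queue if total is None else total + rx_queue
    return total


def read_accept_queue_length(port, path="/proc/net/tcp"):
    """Convenience wrapper that reads the proc file and parses it."""
    with open(path) as f:
        return accept_queue_length(f.read(), port)
```

Calling `read_accept_queue_length(80)` periodically from a monitoring thread or timer would give a rough trend of how far the server is falling behind, without touching the event loop itself.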