Many linux/unix programming books and tutorials speak about the "Thundering Herd Problem" which happens when multiple threads or forks are blocked on a select() call waiting for readability of a listening socket. When the connection comes in, all threads and forks are woken up but only one "wins" with a successful call to "accept()". In the meantime, a lot of cpu time is wasted waking up all the threads/forks for no reason.
I noticed a project which provides a "fix" for this problem in the linux kernel, but this is a very old patch.
I think there are two variants; One where each fork does select() and then accept(), and one that just does accept().
Do modern unix/linux kernels still have the Thundering Herd Problem in both these cases or only the "select() then accept()" version?
For years, most unix/linux kernels serialize response to accept(2)s, in other words, only one thread is waken up if more than one are blocking on accept(2) against a single open file description.
OTOH, many (if not all) kernels still have the thundering herd problem in the select-accept pattern as you describe.
I have written a simple script ( https://gist.github.com/kazuho/10436253 ) to verify the existence of the problem, and found out that the problem exists on linux 2.6.32 and Darwin 12.5.0 (OS X 10.8.5).
This is a very old problem, and for the most part does not exist any more. The Linux kernel (for the past few years) has had a number of changes with the way it handles and routes packets up the network stack, and includes many optimizations to ensure both low latency, and fairness (i.e., minimize starvation).
That said, the select system has a number of scalability issues simply by way of its API. When you have a large number of file descriptors, the cost of a select call is very high. This is primarily due to having to build, check, and maintain the FD sets that are passed to and from the system call.
Now days, the preferred way to do asynchronous IO is with epoll. The API is far simpler and scales very nicely across various types of load (many connections, lots of throughput, etc.)
Refer the link below which talks about separate flags to epoll to avoid this problem.
http://lwn.net/Articles/632590/
I recently saw tested a scenario where multiple threads polled on a listening unix-domain socket and then accepted the connection. All threads woke up using the poll() system call.
This was a custom build of the linux kernel rather than a distro build so perhaps there is a kernel configure option that changes it but I don't know what that would be.
We did not try epoll.