Below is an ApacheBench (ab) run for 10K requests with 50 concurrent requests.
I need help understanding the results. Does anything stand out that might point to something blocking and limiting the number of requests per second?
I'm looking at the connection times section, which lists 'Connect', 'Processing' and 'Waiting'. The mean connect time is 0 ms, the mean processing time is 208 ms and the mean waiting time is 208 ms, yet the total is 208 ms. Can someone explain this to me? It doesn't make much sense.
Connect time is the time it took ab to establish a connection with your server. You are probably running it on the same server or within the LAN, so your connect time is 0.
Processing time is the total time the server took to process the request and send the complete response.
Wait time is the time between sending the request and receiving the first byte of the response.
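Again, since you are running on the same server and the file is small, your processing time == wait time: once the first byte arrives, the rest of the response follows almost immediately. Wait time is already counted inside processing time, and the total is connect plus processing, which is why 0 + 208 gives a total of 208.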
For a real benchmark, run ab from multiple locations near your target market to get a realistic idea of latency. Right now the only information you have is the wait time.
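To make the relationship between the three figures concrete, here is a small sketch of how the phases relate. The class and function names are hypothetical and only illustrate the arithmetic; they are not taken from ab itself, and the timestamps are made up to match the means in the question.

```python
from dataclasses import dataclass


@dataclass
class RequestTimestamps:
    """Hypothetical per-request timestamps in milliseconds."""
    start: int          # client begins connecting
    connected: int      # TCP connection established
    request_sent: int   # last byte of the request written
    first_byte: int     # first byte of the response read
    done: int           # complete response read


def ab_style_breakdown(t: RequestTimestamps) -> dict:
    """Split one request into connect / processing / waiting / total."""
    connect = t.connected - t.start
    processing = t.done - t.connected         # everything after the connect
    waiting = t.first_byte - t.request_sent   # a slice *inside* processing
    total = t.done - t.start                  # equals connect + processing
    return {
        "connect": connect,
        "processing": processing,
        "waiting": waiting,
        "total": total,
    }


# Illustrative values: instant connect, tiny response that arrives all at once.
example = RequestTimestamps(start=0, connected=0, request_sent=0,
                            first_byte=208, done=208)
print(ab_style_breakdown(example))
# {'connect': 0, 'processing': 208, 'waiting': 208, 'total': 208}
```

Waiting is not added to the total a second time, so 0 + 208 is 208 rather than 416.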
This question is getting old, but I've run into the same problem so I might as well contribute an answer.
You might benefit from disabling either Nagle's algorithm on the client side or delayed ACK on the server side. The two can interact badly and cause an unwanted delay. As in my case, that is probably why your minimum time is exactly 200ms.
I can't confirm it, but my understanding is that the problem is cross-platform, since both mechanisms are part of the TCP spec. It might only affect short-lived connections that send and receive a small amount of data, though I've seen reports of issues with larger transfers too. Maybe somebody who knows TCP better can chime in.
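If you control the client code, turning off Nagle's algorithm is usually a single socket option. The following is a minimal sketch for a hand-written Python client, not an ab setting; the function name is hypothetical. It uses the standard `TCP_NODELAY` option, which asks the stack to send small writes immediately instead of holding them back until an ACK arrives.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 disables Nagle's algorithm for this socket only.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# A non-zero value means the option is set.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
sock.close()
```

This only addresses the Nagle half of the interaction. Delayed ACK is controlled separately and in an OS-specific way (Linux, for example, exposes a `TCP_QUICKACK` socket option), so check your platform's documentation before changing it on the server.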
Reference:
http://en.wikipedia.org/wiki/TCP_delayed_acknowledgment#Problems
http://blogs.technet.com/b/nettracer/archive/2013/01/05/tcp-delayed-ack-combined-with-nagle-algorithm-can-badly-impact-communication-performance.aspx