Background: I'm using CreateIoCompletionPort, WSASend/Recv, and GetQueuedCompletionStatus to do overlapped socket io on my server. For flow control, when sending to the client, I only allow several WSASend() to be called when all pending OVERLAPs have popped off the IOCP.
Problem: Recently, there are occassions when the OVERLAPs do not get returned to the IOCP. The thread calling GetQueuedCompletionStatus does not get them and they remain in my local pending queue. I've verified that the client DOES receive the data off the socket and the socket is connected. No errors were returned when the WSASend() calls were made. The OVERLAPs simply "never" come back without an external stimulus like the following:
- Disconnecting the socket from the client or server, immediately allows the GetQueuedCompletionStatus thread to retrieve the OVERLAPs
- Making additional calls to WSASend(), sometimes several are needed, before all the OVERLAPs suddenly pop off the queue.
Question: Has anyone seen this type of behavior? Any ideas on what is causing this?
Thanks,
Geoffrey
WSASend()
can fail to complete in a timely manner if the TCP window is full. In this case the stack can't send any more data so your WSASend()
waits and your completion doesn't occur until the TCP stack CAN send more data.
If you happen to have a protocol between your client and server that has no flow control built into the protocol itself AND you aren't doing any flow control yourself based on write completions and are just sending data as fast as your server can send then you may get to a point where either the network or your client can't keep up and TCP flow control kicks in (when the TCP window gets full). If you continue to just fire off data asynchronously with additional calls to WSASend()
then eventually you'll chew your way through all of the non-paged memory on the machine and at that point all bets are off (chances are high that a driver may cause the box to bluescreen).
So, in summary, completions from overlapped socket writes can and will sometimes take longer to come back than you may expect. In your example, I expect that the completions that you get when you close the socket are all failures?
I talk about this some more on my blog; here: http://www.lenholgate.com/blog/2008/07/write-completion-flow-control.html and here: http://www.serverframework.com/asynchronousevents/2011/06/tcp-flow-control-and-asynchronous-writes.html