I have written a multi-client server program in C on SuSE Linux Enterprise Server 12.3 (x86_64). I am using one thread per client to receive data.
My problem is:
I am using one terminal to run the server and several other terminals to telnet to my server (as clients). I use recv() in the server to receive data from the client, and I check its return value: error on -1, connection closed on 0, and normal operation otherwise. I do not pass any flags to recv().
My program works fine if I just close the telnet session (i.e. disconnect the client) normally using Ctrl+] and close, but if I forcefully terminate the client using kill <pid>, then my server is unable to detect the loss of connection.
How can I fix that?
Constraint: I do not want to impose any condition on the client side; I want to fix this on the server side only.
You can enable SO_KEEPALIVE on the socket in your server.
/* enable keep-alive on the socket */
int one = 1;
setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &one, sizeof(one));
By default, when keep-alive is enabled, the connection has to be idle for 2 hours before a keep-alive probe is attempted. You can make keep-alive more aggressive by adjusting the TCP_KEEPIDLE parameter:
int idletime = 120; /* in seconds */
setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &idletime, sizeof(idletime));
When a probe is sent, it expects an acknowledgement from the other end. If the acknowledgement arrives, no further probe is sent until the idle timer expires again. If no acknowledgement is received, the probe is retried, by default every 75 seconds. This interval can be adjusted with the TCP_KEEPINTVL option. The TCP_KEEPCNT option controls how many successive failures trigger the connection to be dropped. By default, that number is 9.
These options are available on Linux. BSD has similar options, but they are named differently.
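To put the three timers together, here is a minimal sketch for Linux. The helper name enable_keepalive and the values in the usage note are my own illustration, not anything from the question; adjust them to taste.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (name and parameters are illustrative):
 * enable keep-alive on a TCP socket and tune its three timers.
 * Returns 0 on success, -1 if any setsockopt() call fails (errno is set). */
int enable_keepalive(int sock, int idle_s, int intvl_s, int count)
{
    int one = 1;

    /* Turn keep-alive probing on for this socket. */
    if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &one, sizeof(one)) < 0)
        return -1;
    /* Seconds of idleness before the first probe is sent. */
    if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    /* Seconds between probes while no acknowledgement arrives. */
    if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
        return -1;
    /* Number of unanswered probes before the connection is dropped. */
    if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0)
        return -1;
    return 0;
}
```

Calling enable_keepalive(sock, 120, 10, 3) on each accepted socket should drop a silent peer after roughly 120 + 3 * 10 = 150 seconds of idleness. Once that happens, a blocked recv() should return -1, typically with errno set to ETIMEDOUT, so it lands in the existing error branch.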
About all you'd be able to do is implement a timeout of some sort. You won't be able to determine for certain that the client has disconnected unless it actually does the disconnect itself. The closest you'll get is noticing that the client was required to send something and failed to do so in a timely manner.
As for why: TCP is just a layer on top of IP. There's nothing actually connecting the two computers; a "connection" is simply an acknowledgement that another machine exists and has agreed to exchange info with you using TCP. The "connection" abstraction only holds as long as both sides act according to the rules. Forcefully killing the client makes it unable to hold up its end of the deal, so the server is left hanging.
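One way to implement such a timeout on the server side is to wait with poll() before calling recv(). This is only a sketch: the helper name recv_with_timeout and the -2 "timed out" return value are conventions I made up for the example.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (name and return convention are illustrative):
 * wait up to timeout_ms milliseconds for data, then call recv().
 * Returns >0 for bytes read, 0 if the peer closed the connection,
 * -1 on error (errno is set), and -2 if nothing arrived in time. */
ssize_t recv_with_timeout(int sock, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int ready;

    pfd.fd = sock;
    pfd.events = POLLIN;
    pfd.revents = 0;

    ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;   /* poll() itself failed, e.g. EINTR */
    if (ready == 0)
        return -2;   /* timeout: the client stayed silent */

    /* Readable, hung up, or in error: recv() reports which one. */
    return recv(sock, buf, len, 0);
}
```

The per-client thread can treat -2 as "client is presumed gone" and close the socket, or count a few consecutive timeouts before giving up. What counts as "timely" depends on your protocol, so the timeout value is yours to pick.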
My program works fine if I just close the telnet session (i.e. disconnect client) normally using Ctrl+] and close, but if I forcefully terminate the client using kill or closing the terminal, then my server is unable to detect loss of connection.
In either case the client socket gets closed, either by telnet itself or by the kernel when it destroys the telnet process. Your server must then receive a FIN segment, which causes recv() to return 0 (after all pending data has been read from the socket).
You are probably not processing all return codes from recv() correctly.
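For comparison, here is a minimal sketch of a per-client receive loop that handles every recv() outcome. I have not seen your code, so the function name serve_client and the byte counter are hypothetical stand-ins for your real processing.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical per-client loop (name and signature are illustrative).
 * Reads until the peer closes or an error occurs.
 * Returns 0 on orderly close (FIN received), -1 on error (errno is set).
 * *total receives the number of bytes read; it stands in for real work. */
int serve_client(int sock, size_t *total)
{
    char buf[4096];

    *total = 0;
    for (;;) {
        ssize_t n = recv(sock, buf, sizeof(buf), 0);

        if (n > 0) {
            *total += (size_t)n;   /* normal operation: process n bytes here */
            continue;
        }
        if (n == 0)
            return 0;              /* peer closed: FIN seen, no data left */
        if (errno == EINTR)
            continue;              /* interrupted by a signal, just retry */
        return -1;                 /* real error, e.g. ECONNRESET */
    }
}
```

Whichever value it returns, the calling thread should close(sock) afterwards. If your loop never reaches the n == 0 branch after kill <pid>, check that the thread is really blocked in recv() on that client's descriptor and not on a stale or shared one.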