I think I understand the formal meaning of the SO_LINGER option. It is used in some legacy code I'm maintaining now. The customer complains that we answer with an RST when they send a FIN to close the connection from their side.
I am not sure I can remove it safely, since I don't understand when it should be used.
Can you please give an example of when the option would be required?
Whether you can safely remove the linger from your code depends on the type of your application: is it a "client" (opening TCP connections and actively closing them first) or a "server" (listening for incoming TCP connections and closing them after the other side initiated the close)?
If your application has the flavor of a "client" (closing first) AND you open and close a huge number of connections to different servers (e.g. a monitoring app supervising the reachability of a huge number of different servers), your app has the problem that all your client connections get stuck in the TIME_WAIT state. In that case I would recommend shortening the timeout to a value smaller than the default, so you still shut down gracefully but free up the client connection resources earlier. I would not set the timeout to 0, as 0 does not shut down gracefully with FIN but aborts with RST.
If your application has the flavor of a "client" and has to fetch a huge number of small files from the same server, you should not open a new TCP connection per file and end up with a huge number of client connections in TIME_WAIT. Instead, keep the connection open and fetch all data over the same connection. The linger option can and should be removed.
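To illustrate the "one connection, many files" idea, here is a minimal sketch. It assumes the files are served over HTTP/1.1, which the post above does not say; the handler `FileHandler`, the helper `fetch_all` and the paths are made up for the example, and the tiny local server is only a stand-in for the real one.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class FileHandler(BaseHTTPRequestHandler):
    """Stand-in file server (hypothetical), used only for this demo."""
    protocol_version = "HTTP/1.1"  # allows persistent connections
    peers = set()                  # (ip, port) of every client connection seen

    def do_GET(self):
        FileHandler.peers.add(self.client_address)
        body = ("contents of " + self.path).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_all(host, port, paths):
    """Fetch every path over ONE TCP connection instead of one per file."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        bodies = []
        for path in paths:
            conn.request("GET", path)
            # Read the whole body before sending the next request.
            bodies.append(conn.getresponse().read())
        return bodies
    finally:
        conn.close()  # a single close, so at most one socket in TIME_WAIT


def demo():
    FileHandler.peers.clear()
    server = ThreadingHTTPServer(("127.0.0.1", 0), FileHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        bodies = fetch_all("127.0.0.1", port, ["/a.txt", "/b.txt", "/c.txt"])
    finally:
        server.shutdown()
        server.server_close()
    return bodies, len(FileHandler.peers)
```

`demo()` returns the fetched bodies together with the number of distinct client connections the server saw, so you can check that all three files really travelled over one connection.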
If your application is a "server" (closing second, as a reaction to the peer's close), then on close() your connection is shut down gracefully and resources are freed up, as you don't enter the TIME_WAIT state. Linger should not be used. But if your server app has a supervisory process that detects open connections which have been idle for a long time ("long" is to be defined), you can shut down such an inactive connection from your side - see it as a kind of error handling - with an abortive shutdown. This is done by setting the linger timeout to 0. close() will then send an RST to the client, telling him that you are angry :-)
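A rough sketch of such a supervisory step could look like the following. The names `abortive_close`, `reap_idle` and the `last_activity` table are invented for the example, and the `struct linger` layout of two C ints is an assumption that holds on Linux and the BSDs but not on Windows.

```python
import socket
import struct
import time


def abortive_close(sock):
    """Close with linger on and timeout 0, so the stack sends RST, not FIN."""
    # struct linger { int l_onoff; int l_linger; }  (assumed layout, see above)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def reap_idle(last_activity, max_idle, now=None):
    """Abort every connection that has been idle for more than max_idle seconds.

    last_activity maps a connected socket to the time.monotonic() timestamp
    of its last traffic. Returns the list of sockets that were aborted.
    """
    now = time.monotonic() if now is None else now
    reaped = []
    for sock, seen in list(last_activity.items()):
        if now - seen > max_idle:
            abortive_close(sock)
            del last_activity[sock]
            reaped.append(sock)
    return reaped
```

Connections that are closed normally should still go through a plain close() without touching SO_LINGER; only the idle ones picked by `reap_idle` get the RST.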
The typical reason to set a SO_LINGER timeout of zero is to avoid large numbers of connections sitting in the TIME_WAIT state, tying up all the available resources on a server.
When a TCP connection is closed cleanly, the end that initiated the close ("active close") ends up with the connection sitting in TIME_WAIT for several minutes. So if your protocol is one where the server initiates the connection close, and involves very large numbers of short-lived connections, then it might be susceptible to this problem.
This isn't a good idea, though - TIME_WAIT exists for a reason (to ensure that stray packets from old connections don't interfere with new connections). It's a better idea to redesign your protocol to one where the client initiates the connection close, if possible.
For my suggestion, please read the last section: “When to use SO_LINGER with timeout 0”.
Before we come to that, a little lecture about TIME_WAIT, FIN, ACK and RST.
Normal TCP termination
The normal TCP termination sequence looks like this (simplified):
We have two peers: A and B
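1. A calls close()
   - A sends FIN to B
   - A goes into FIN_WAIT_1 state
2. B receives FIN
   - B sends ACK to A
   - B goes into CLOSE_WAIT state
3. A receives ACK
   - A goes into FIN_WAIT_2 state
4. B calls close()
   - B sends FIN to A
   - B goes into LAST_ACK state
5. A receives FIN
   - A sends ACK to B
   - A goes into TIME_WAIT state
6. B receives ACK
   - B goes to CLOSED state – i.e. is removed from the socket tables
TIME_WAIT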
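Replaying the simplified sequence above as a small state table makes it easy to see which side passes through TIME_WAIT. This is only a sketch of the six steps listed, not of the full TCP state machine (no simultaneous close, no timeouts); `TRANSITIONS`, `STEPS` and `replay` are names made up for the example.

```python
# (current state, event) -> next state, limited to the normal close sequence.
TRANSITIONS = {
    ("ESTABLISHED", "call close()"): "FIN_WAIT_1",
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("CLOSE_WAIT", "call close()"): "LAST_ACK",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}

# The six steps of the sequence, in order: (peer, event).
STEPS = [
    ("A", "call close()"),
    ("B", "recv FIN"),
    ("A", "recv ACK"),
    ("B", "call close()"),
    ("A", "recv FIN"),
    ("B", "recv ACK"),
]


def replay(steps=STEPS):
    """Return, per peer, the list of states it goes through."""
    state = {"A": "ESTABLISHED", "B": "ESTABLISHED"}
    visited = {"A": ["ESTABLISHED"], "B": ["ESTABLISHED"]}
    for peer, event in steps:
        state[peer] = TRANSITIONS[(state[peer], event)]
        visited[peer].append(state[peer])
    return visited


if __name__ == "__main__":
    for peer, states in replay().items():
        print(peer, " -> ".join(states))
```
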
So the peer that initiates the termination – i.e. calls close() first – will end up in the TIME_WAIT state.
To understand why the TIME_WAIT state is our friend, please read section 2.7 in "UNIX Network Programming" third edition by Stevens et al (page 43).
However, it can be a problem with lots of sockets in TIME_WAIT state on a server as it could eventually prevent new connections from being accepted.
To work around this problem, I have seen many suggesting to set the SO_LINGER socket option with timeout 0 before calling close(). However, this is a bad solution as it causes the TCP connection to be terminated with an error.
Instead, design your application protocol so the connection termination is always initiated from the client side. If the client always knows when it has read all remaining data it can initiate the termination sequence. As an example, a browser knows from the Content-Length HTTP header when it has read all data and can initiate the close. (I know that in HTTP 1.1 it will keep it open for a while for a possible reuse, and then close it.)
If the server needs to close the connection, design the application protocol so the server asks the client to call close().
When to use SO_LINGER with timeout 0
Again, according to "UNIX Network Programming" third edition page 202-203, setting SO_LINGER with timeout 0 prior to calling close() will cause the normal termination sequence not to be initiated.
Instead, the peer setting this option and calling close() will send an RST (connection reset), which indicates an error condition, and this is how it will be perceived at the other end. You will typically see errors like "Connection reset by peer".
Therefore, in the normal situation it is a really bad idea to set SO_LINGER with timeout 0 prior to calling close() – from now on called abortive close – in a server application.
However, certain situations warrant doing so anyway:
- [...] CLOSE_WAIT or ending up in the TIME_WAIT state.
- [...] TIME_WAIT (when calling close() from the server end) as this might prevent the server from getting available ports for new client connections after being restarted.
- [...] CLOSE_WAIT trying to deliver data to a stuck terminal port, but would properly reset the stuck port if it got an RST to discard the pending data."
I would recommend this long article which I believe gives a very good answer to your question.
When linger is on but the timeout is zero, the TCP stack doesn't wait for pending data to be sent before closing the connection. Data could be lost because of this, but by setting linger this way you're accepting that and asking for the connection to be reset straight away rather than closed gracefully. This causes an RST to be sent rather than the usual FIN.
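You can see the difference from the other end with a small loopback experiment. This is a sketch only: `peer_view_of_close` is a made-up helper, and packing `struct linger` as two C ints is an assumption that fits Linux and the BSDs (Windows uses a different layout).

```python
import socket
import struct


def peer_view_of_close(abortive):
    """Close one end of a loopback TCP connection and report what the peer sees.

    Returns "eof" when the peer reads a clean end of stream (FIN was sent),
    or "reset" when the peer gets "Connection reset by peer" (RST was sent).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=5)
    server_side, _ = listener.accept()
    listener.close()
    try:
        if abortive:
            # Linger on, timeout 0: close() resets the connection.
            server_side.setsockopt(
                socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0)
            )
        server_side.close()
        try:
            data = client.recv(1)
        except ConnectionResetError:
            return "reset"
        return "eof" if data == b"" else "data"
    finally:
        client.close()
```

Calling it with `abortive=False` shows the graceful case, and with `abortive=True` the client side fails with the reset error described above.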
Thanks to EJP for his comment, see here for details.