When you read from a closed TCP socket you get a regular error: the call either returns 0, indicating EOF, or returns -1 and sets an error code in errno, which can be printed with perror.
However, when you write to a closed TCP socket the OS sends SIGPIPE to your app, which terminates the app if the signal is not caught or ignored.
Why is writing to a closed TCP socket worse than reading from it?
+1 to Greg Hewgill for leading my thought process in the right direction to find the answer.
The real reason for SIGPIPE in both sockets and pipes is the filter idiom / pattern which applies to typical I/O in Unix systems.

Starting with pipes. Filter programs like grep typically write to STDOUT and read from STDIN, which may be redirected by the shell to a pipe. For example, grep's output may be piped into another program, called doSomeThingErrorProne below.

The shell, when it forks and then exec's these programs, probably uses the dup2 system call to redirect STDIN, STDOUT and STDERR to the appropriate pipes.

Since the filter program grep doesn't know, and has no way of knowing, that its output has been redirected, the only way to tell it to stop writing to a broken pipe if doSomeThingErrorProne crashes is with a signal, since return values of writes to STDOUT are rarely if ever checked.

The analog with sockets would be the inetd server taking the place of the shell.

As an example, I assume you could turn grep into a network service which operates over TCP sockets. With inetd, if you want to have a grep server on TCP port 8000, you add an entry for the service to /etc/services and then a matching entry to /etc/inetd.conf that runs grep with foo as an argument.

Send SIGHUP to inetd and connect to port 8000 with telnet. This should cause inetd to fork, dup the socket onto STDIN, STDOUT and STDERR, and then exec grep with foo as an argument. If you start typing lines into telnet, grep will echo those lines which contain foo.

Now replace grep with a program named ticker that, for instance, writes a stream of real-time stock quotes to STDOUT and gets commands on STDIN. Someone telnets to port 8000 and types "start java" to get quotes for Sun Microsystems. Then they get up and go to lunch. telnet inexplicably crashes. If there was no SIGPIPE to send, then ticker would keep sending quotes forever, never knowing that the process on the other end had crashed, and needlessly wasting system resources.

Usually if you're writing to a socket, you would expect the other end to be listening. This is sort of like a telephone call - if you're speaking, you wouldn't expect the other party to simply hang up the call.
If you're reading from a socket, then you're expecting the other end to either (a) send you something, or (b) close the socket. Situation (b) would happen if you've just sent something like a QUIT command to the other end.
I think a large part of the answer is 'so that a socket behaves rather similarly to a classic Unix (anonymous) pipe'. Those also exhibit the same behaviour - witness the name of the signal.
So, then it is reasonable to ask why do pipes behave that way. Greg Hewgill's answer gives a summary of the situation.
Another way of looking at it is - what is the alternative? Should a 'read()' on a pipe with no writer give a SIGPIPE signal? The meaning of SIGPIPE would have to change from 'write on a pipe with no one to read it', of course, but that's trivial. There's no particular reason to think that it would be better; the EOF indication (zero bytes to read; zero bytes read) is a perfect description of the state of the pipe, and so the behaviour of read is good.
What about 'write()'? Well, an option would be to return the number of bytes written - zero. But that is not a good idea; it implies that the code should try again and maybe more bytes would be sent, which is not going to be the case.

Another option would be an error - write() returns -1 and sets an appropriate errno. It isn't immediately clear which one fits. EINVAL and EBADF are both inaccurate: the file descriptor is valid and open at this end (and should be closed after the failing write); there just isn't anything to read from it. EPIPE means 'broken pipe'; so, with a caveat about "this is a socket, not a pipe", it would be the appropriate error. It is in fact the errno that write() sets if you ignore SIGPIPE.

It would be feasible to do this - just return an appropriate error when the pipe is broken (and never send the signal). However, it is an empirical fact that many programs do not pay much attention to where their output is going. If you pipe a command that reads a multi-gigabyte file into a process that quits after the first 20 KB, and the command is not checking the status of its writes, then it will take a long time to finish and will waste machine effort while doing so. If the OS instead sends it a signal that it is not ignoring, it stops quickly, which is definitely advantageous. And you can get the error instead if you want it. So sending the signal has benefits to the OS in the context of pipes; and sockets emulate pipes rather closely.
Interesting aside: while checking the message for SIGPIPE, I found the socket option:
Think of the socket as a big pipeline of data between the sending and the receiving process. Now imagine that the pipeline has a valve that is shut (the socket connection is closed).
If you're reading from the socket (trying to get something out of the pipe), there's no harm in trying to read something that isn't there; you just won't get any data out. In fact, you may, as you said, get an EOF, which is correct, as there's no more data to be read.
However, writing to this closed connection is another matter. Data won't go through, and you may wind up dropping some important communication on the floor. (You can't send water down a pipe with a closed valve; if you try, something will probably burst somewhere, or, at the very least, the back pressure will spray water all over the place.) That's why there's a more powerful tool to alert you to this condition, namely, the SIGPIPE signal.
You can always ignore or block the signal, but you do so at your own risk.