Using bash process substitution, I want to run two different commands on a file simultaneously. In this example it is not necessary, but imagine that "cat /usr/share/dict/words" was a very expensive operation, such as uncompressing a 50 GB file.
cat /usr/share/dict/words | tee >(head -1 > h.txt) >(tail -1 > t.txt) > /dev/null
After this command I would expect h.txt to contain the first line of the words file "A", and t.txt to contain the last line of the file "Zyzzogeton".
However what actually happens is that h.txt contains "A" but t.txt contains "argillaceo" which is about 5% into the file.
Why does this happen? It seems like either the "tail" process is terminating early or the streams are getting mixed up.
Running another similar command like this behaves as expected:
cat /usr/share/dict/words | tee >(grep ^a > a.txt) >(grep ^z > z.txt) > /dev/null
After this command I'd expect a.txt to contain all the words that begin with "a", while z.txt contains all of the words that begin with "z", which is exactly what happened.
So why doesn't this work with "tail", and with what other commands will this not work?
OK, what seems to happen is that once the `head -1` command finishes, it exits, and the next time `tee` tries to write to the named pipe that the process substitution set up, the write fails with `EPIPE` and, according to `man 2 write`, also generates `SIGPIPE` in the writing process. That causes `tee` to exit, which forces the `tail -1` to exit immediately (printing the last line it had received so far), and the `cat` on the left gets a `SIGPIPE` as well.
We can see this a little better if we add a bit more to the `head` process list and make the output both more predictable and also written to `stderr`, without relying on the `tee`. When I ran that, it got just 1 more iteration of the loop before everything exited (though `t.txt` still only has `1` in it). Checking afterwards shows a result which this question ties to `SIGPIPE` in a very similar fashion to what we're seeing here.
The coreutils maintainers have added this as an example to their `tee` "gotchas" for posterity. For a discussion with the devs about how this fits into POSIX compliance, you can see the (closed, notabug) report at http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22195
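
The SIGPIPE chain described above can be reproduced with a small numbered loop instead of a large file. This is only a sketch under my own assumptions, not the exact command from the experiment: the scratch directory, the `loop`/`head done` markers, and the `sleep 0.01` pacing are all made up for illustration, and how many iterations get printed depends on timing.

```shell
#!/usr/bin/env bash
# Sketch: a slow, numbered producer feeding the same tee/head/tail layout.
workdir=$(mktemp -d)   # hypothetical scratch directory
cd "$workdir" || exit 1

for i in {1..30}; do
  echo "$i"            # goes down the pipe to tee
  echo "loop $i" >&2   # progress marker on stderr, independent of tee
  sleep 0.01           # pacing so head has time to exit between writes
done | tee >(head -1 > h.txt; echo "head done" >&2) >(tail -1 > t.txt) > /dev/null

# Copy PIPESTATUS immediately; the next command would overwrite it.
status=("${PIPESTATUS[@]}")

# A stage killed by SIGPIPE (signal 13) reports 128 + 13 = 141.
echo "exit statuses (loop, tee): ${status[*]}"
echo "h.txt: $(cat h.txt)"
```

If `tee` was killed by SIGPIPE, its entry in the saved `PIPESTATUS` array should be 141, and the loop markers on stderr should stop well short of 30.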
If you have access to GNU coreutils version 8.24, they have added some options (not in POSIX) that can help, like `-p` or `--output-error=warn`. Without that, you can take a bit of a risk but get the desired functionality in the question by trapping and ignoring SIGPIPE. That will have the expected results in both `h.txt` and `t.txt`, but if something else happened that wanted SIGPIPE to be handled correctly, you'd be out of luck with this approach.
Another hacky option would be to zero out `t.txt` before starting, then not let the `head` process list finish until it is non-zero length.
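
As a rough illustration of the first two workarounds mentioned above (GNU `tee -p` and ignoring SIGPIPE), here is a minimal sketch. It assumes GNU coreutils 8.24 or later for `-p`; the `producer` function, the `wait_for` helper, and the file names are hypothetical stand-ins, with `seq` playing the role of the expensive command.

```shell
#!/usr/bin/env bash
# Sketch only: producer, wait_for and the file names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

producer() { seq 1 200000; }   # stands in for the expensive command

# Process substitutions run asynchronously, so poll (for up to ~5 s)
# until an output file is non-empty before reading it.
wait_for() {
  local tries
  for tries in $(seq 1 50); do
    [ -s "$1" ] && return 0
    sleep 0.1
  done
  return 1
}

# Workaround 1 (GNU tee 8.24+ only): with -p, tee keeps feeding the
# remaining outputs after head closes its pipe.
producer | tee -p >(head -1 > h1.txt) >(tail -1 > t1.txt) > /dev/null
wait_for h1.txt && wait_for t1.txt

# Workaround 2: ignore SIGPIPE. The subshell keeps the ignored signal
# from leaking into the rest of the session; everything started inside
# it (producer, tee, head, tail) inherits the ignore.
(
  trap '' PIPE
  producer | tee >(head -1 > h2.txt) >(tail -1 > t2.txt) > /dev/null
)
wait_for h2.txt && wait_for t2.txt

printf 'tee -p: first=%s last=%s\n' "$(cat h1.txt)" "$(cat t1.txt)"
printf 'trap:   first=%s last=%s\n' "$(cat h2.txt)" "$(cat t2.txt)"
```

Wrapping the `trap '' PIPE` in a subshell is a design choice to limit the risk noted above: commands run later in the same session still get normal SIGPIPE handling.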