I am working on a pipeline that has a few branch points that subsequently merge-- they look something like this:
command2
/ \
command1 command4
\ /
command3
Each command writes to STDOUT and accepts input via STDIN. STDOUT from command1 needs to be passed to both command2 and command3, which are run sequentially, and their output needs to be effectively concatenated and passed to command4. I initially thought that something like this would work:
$ command1 | (command2; command3) | command4
That doesn't work though, as only STDOUT from command2 is passed to command 4, and when I remove command4 it's apparent that command3 isn't being passed the appropriate stream from command1 -- in other words, it's as if command2 is exhausting or consuming the stream. I get the same result with { command2 ; command3 ; } in the middle as well. So I figured I should be using 'tee' with process substitution, and tried this:
$ command1 | tee >(command2) | command3 | command4
But surprisingly that didn't work either -- it appears that the output of command1 and the output of command2 are piped into command3, which results in errors and only the output of command3 being piped into command4. I did find that the following gets the appropriate input and output to and from command2 and command3:
$ command1 | tee >(command2) >(command3) | command4
However, this streams the output of command1 to command4 as well, which leads to issues as command2 and command3 produce a different specification than command1. The solution I've arrived on seems hacky, but it does work:
$ command1 | tee >(command2) >(command3) > /dev/null | command4
That suppresses command1 passing its output to command4, while collecting STDOUT from command2 and command3. It works, but I feel like I'm missing a more obvious solution. Am I? I've read dozens of threads and haven't found a solution to this problem that works in my use case, nor have I seen an elaboration of the exact problem of splitting and re-joining streams (though I can't be the first one to deal with this). Should I just be using named pipes? I tried but had difficulty getting that working as well, so maybe that's another story for another thread. I'm using bash in RHEL5.8.