$ cat script.py
import sys
for line in sys.stdin:
sys.stdout.write(line)
sys.stdout.flush()
$ cat script.py - | python -u script.py
The output is right but it only starts printing once I hit Ctrl-D whereas the following starts printing right away :
$ cat script.py - | cat
which led me to think that the buffering does not come from cat.
I managed to get it working by doing :
for line in iter(sys.stdin.readline, ""):
as explained here : Streaming pipes in Python, but I don't understand why the former solution doesn't work as expected.
Python manpage reveals the answer to your question:
-u Force stdin, stdout and stderr to be totally unbuffered. On systems where it matters, also put stdin, stdout and stderr in binary mode. Note that
there is internal buffering in xreadlines(), readlines() and file-object iterators ("for line in sys.stdin") which is not influenced by this
option. To work around this, you will want to use "sys.stdin.readline()" inside a "while 1:" loop.
That is: file-object iterators' internal buffering is to blame (and it doesn't go away with -u).
cat does block buffering by default if output is to a pipe. So when you include - (stdin) in the cat command, it waits to get EOF (your ctrl-D closes the stdin stream) or 8K (probably) of data before outputting anything.
If you change the cat command to "cat script.py |" you'll see that it works as you expected.
Also, if you add 8K of comments to the end of script.py, it will immediately print it as well.
Edit:
The above is wrong. :-)
It turns out that file.next() (used by file iterators, ie. for line in file) has a hidden read-ahead buffer that is not used by readline(), which simply reads a character until it sees a newline or EOF.