Consider something like:
cat file | command > file
Is this good practice? Could this overwrite the input file at the same time as we are reading it, or is the file always read fully into memory first and then piped to the second command?
Obviously, I can use temp files as an intermediate step, but I'm just wondering..
t=$(mktemp)
cat file | command > "$t" && mv "$t" file
No, it is not OK. All commands in a pipeline start at essentially the same time, and the shell sets up each command's redirections before running it. The > file redirection truncates the file the moment it is processed, so the command will almost certainly clobber the file before cat has finished reading it (or even started).
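You can see the truncation concretely with a minimal sketch. Here true (which reads and writes nothing) stands in for the command, purely for demonstration: even though the command itself touches no data, the redirection alone empties the file before anything runs.

```shell
# Run in a throwaway directory so nothing real is harmed.
dir=$(mktemp -d)
cd "$dir"
printf 'important data\n' > file
# The shell opens file for reading, then opens it for writing with
# truncation -- all before the command executes.
true < file > file
wc -c < file    # prints 0: the redirection already emptied the file
```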
You need sponge(1) from moreutils:
command < file | sponge file
sponge soaks up all of its standard input before opening the output file, so file is only rewritten after it has been read in full.
You can also use a trick like this (not recommended; use explicit temp files in production code):
{ rm file && your_command > file; } < file
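Here is a runnable sketch of why that trick works, with tr standing in for your_command (an assumption for the demo): the < file redirection opens the old file before rm unlinks its name, so the open file descriptor keeps the original contents readable while the command writes a brand-new file under the same name.

```shell
# Run in a throwaway directory.
dir=$(mktemp -d)
cd "$dir"
printf 'hello\n' > file
# "< file" opens the old file first; rm only removes the directory
# entry, so the open descriptor still delivers the original data
# while "> file" creates a fresh file for the output.
{ rm file && tr a-z A-Z > file; } < file
cat file    # prints HELLO
```

Note that this relies on Unix semantics (an unlinked but open file stays readable), which is exactly why explicit temp files are the more portable, production-safe choice.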
Not only should you NOT write your output to your input, but you should also avoid feeding your output back into your input in a loop.
When dealing with big files, I tried
cat *allfastq30 > Sample_All_allfastq30
and it produced this error message:
cat: Sample_All_allfastq30: input file is output file