Useless use of cat?

2018-12-31 02:24发布

This is probably in many FAQs - instead of using:

cat file | command

(which is called useless use of cat), correct way supposed to be:

command < file

In the 2nd, "correct" way - OS does not have to spawn an extra process.
Despite knowing that, I continued to use useless cat for 2 reasons.

  1. more aesthetic - I like when data moves uniformly only from left to right. And it easier to replace cat with something else (gzcat, echo, ...), add a 2nd file or insert new filter (pv, mbuffer, grep ...).

  2. I "felt" that it might be faster in some cases. Faster because there are 2 processes, 1st (cat) does the reading and the second does whatever. And they can run in parallel, which means sometimes faster execution.

Is my logic correct (for 2nd reason)?

8条回答
千与千寻千般痛.
2楼-- · 2018-12-31 02:57

I think that (the traditional way) using pipe is a bit more faster; on my box I used strace command to see what's going on:

Without pipe:

toc@UnixServer:~$ strace wc -l < wrong_output.c
execve("/usr/bin/wc", ["wc", "-l"], [/* 18 vars */]) = 0
brk(0)                                  = 0x8b50000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77ad000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=29107, ...}) = 0
mmap2(NULL, 29107, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb77a5000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0p\222\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1552584, ...}) = 0
mmap2(NULL, 1563160, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7627000
mmap2(0xb779f000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x178) = 0xb779f000
mmap2(0xb77a2000, 10776, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb77a2000
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7626000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb76268d0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xb779f000, 8192, PROT_READ)   = 0
mprotect(0x804f000, 4096, PROT_READ)    = 0
mprotect(0xb77ce000, 4096, PROT_READ)   = 0
munmap(0xb77a5000, 29107)               = 0
brk(0)                                  = 0x8b50000
brk(0x8b71000)                          = 0x8b71000
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=5540198, ...}) = 0
mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7426000
mmap2(NULL, 1507328, PROT_READ, MAP_PRIVATE, 3, 0x2a8) = 0xb72b6000
close(3)                                = 0
open("/usr/share/locale/locale.alias", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=2570, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77ac000
read(3, "# Locale name alias data base.\n#"..., 4096) = 2570
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0xb77ac000, 4096)                = 0
open("/usr/share/locale/fr_FR.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/fr_FR.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/fr_FR/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/fr.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/fr.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/fr/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/fr_FR.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/fr_FR.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/fr_FR/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/fr.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/fr.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/fr/LC_MESSAGES/coreutils.mo", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=316721, ...}) = 0
mmap2(NULL, 316721, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7268000
close(3)                                = 0
open("/usr/lib/i386-linux-gnu/gconv/gconv-modules.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=26064, ...}) = 0
mmap2(NULL, 26064, PROT_READ, MAP_SHARED, 3, 0) = 0xb7261000
close(3)                                = 0
read(0, "#include<stdio.h>\n\nint main(int "..., 16384) = 180
read(0, "", 16384)                      = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7260000
write(1, "13\n", 313
)                     = 3
close(0)                                = 0
close(1)                                = 0
munmap(0xb7260000, 4096)                = 0
close(2)                                = 0
exit_group(0)                           = ?

And with pipe:

toc@UnixServer:~$ strace cat wrong_output.c | wc -l
execve("/bin/cat", ["cat", "wrong_output.c"], [/* 18 vars */]) = 0
brk(0)                                  = 0xa017000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb774b000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=29107, ...}) = 0
mmap2(NULL, 29107, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7743000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0p\222\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1552584, ...}) = 0
mmap2(NULL, 1563160, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75c5000
mmap2(0xb773d000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x178) = 0xb773d000
mmap2(0xb7740000, 10776, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7740000
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75c4000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb75c48d0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xb773d000, 8192, PROT_READ)   = 0
mprotect(0x8051000, 4096, PROT_READ)    = 0
mprotect(0xb776c000, 4096, PROT_READ)   = 0
munmap(0xb7743000, 29107)               = 0
brk(0)                                  = 0xa017000
brk(0xa038000)                          = 0xa038000
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=5540198, ...}) = 0
mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb73c4000
mmap2(NULL, 1507328, PROT_READ, MAP_PRIVATE, 3, 0x2a8) = 0xb7254000
close(3)                                = 0
fstat64(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
open("wrong_output.c", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0664, st_size=180, ...}) = 0
read(3, "#include<stdio.h>\n\nint main(int "..., 32768) = 180
write(1, "#include<stdio.h>\n\nint main(int "..., 180) = 180
read(3, "", 32768)                      = 0
close(3)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
13

You can do some testing with strace and time command with more and longer commands for good benchmarking.

查看更多
几人难应
3楼-- · 2018-12-31 03:01

In defense of cat:

Yes,

   < input process > output 

or

   process < input > output 

is more efficient, but many invocations don't have performance issues, so you don't care.

ergonomic reasons:

We are used to read from left to right, so a command like

    cat infile | process1 | process2 > outfile

is trivial to understand.

    process1 < infile | process2 > outfile

has to jump over process1, and then read left to right. This can be healed by:

    < infile process1 | process2 > outfile

looks somehow, as if there were an arrow pointing to the left, where nothing is. More confusing and looking like fancy quoting is:

    process1 > outfile < infile

and generating scripts is often an iterative process,

    cat file 
    cat file | process1
    cat file | process1 | process2 
    cat file | process1 | process2 > outfile

where you see your progress stepwise, while

    < file 

not even works. Simple ways are less error prone and ergonomic command catenation is simple with cat.

Another topic is, that most people were exposed to > and < as comparison operators, long before using a computer and when using a computer as programmers, are far more often exposed to these as such.

And comparing two operands with < and > is contra commutative, which means

(a > b) == (b < a)

I remember the first time using < for input redirection, I feared

a.sh < file 

could mean the same as

file > a.sh

and somehow overwrite my a.sh script. Maybe this is an issue for many beginners.

rare differences

wc -c journal.txt
15666 journal.txt
cat journal.txt | wc -c 
15666

The latter can be used in calculations directly.

factor $(cat journal.txt | wc -c)

Of course the < can be used here too, instead of a file parameter:

< journal.txt wc -c 
15666
wc -c < journal.txt
15666

but who cares - 15k?

If I would run occasionally into issues, surely I would change my habit of invocing cat.

When using very large or many, many files, avoiding cat is fine. To most questions the use of cat is orthogonal, off topic, not an issue.

Starting these useless useless use of cat discussion on every second shell topic is only annoying and boring. Get a life and wait for your minute of fame, when dealing with performance questions.

查看更多
登录 后发表回答