How does the POSIX 'tee' command work?

Posted 2019-05-07 10:59

tee newOutputFile < existingInputFile > newOutputFile2

How exactly will tee take in the arguments? Would it be like this?

  1. tee first processes newOutputFile < existingInputFile, so the contents of existingInputFile are written into newOutputFile.
  2. Then newOutputFile > newOutputFile2 is processed, so the contents of newOutputFile are written into newOutputFile2.

I am trying to write a shell that processes this particular command. However, I am confused about the order in which to pass the arguments to tee. The way I have coded my program, it will do:

tee newOutputFile2 < existingInputFile

Tags: shell posix tee
1 answer
啃猪蹄的小仙女
#2 · 2019-05-07 11:53

The tee command is a regular Unix program, just like sh or sort or cat.

All the I/O redirection work involved in handling < existingInputFile and > newOutputFile2 is done by the shell before the tee command is invoked (after the fork that creates the process that will execute the tee command). The command is invoked with its standard input coming from existingInputFile and with its standard output going to newOutputFile2. The only arguments given to tee are argv[0] (the string tee) and argv[1] (the string newOutputFile), plus a null pointer to mark the end of the argument list.

Note specifically that the shell is not involved in the actual reading of existingInputFile; it just opens it for reading and connects it to the standard input of tee, but has no knowledge of whether the tee command actually reads it or not. Similarly, the shell is not involved in the actual writing to newOutputFile2; it just opens and truncates it (or creates it) and connects it to the standard output of tee, but has no knowledge of whether the tee command actually writes anything to it. In this context, while the tee command is running, the parent shell is completely passive, doing no I/O.

By design, tee reads its standard input and writes one copy of everything to each of the files given in its argument list and one more copy to standard output.
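To make that behaviour concrete, here is a minimal sketch of the copy loop in C. It is not the source of any real tee implementation: the names tee_copy and write_all are hypothetical, the sketch ignores options such as -a and -i, and it does not retry after EINTR.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Write all len bytes of buf to fd, retrying after short writes. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0)
    {
        ssize_t n = write(fd, buf, len);
        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper (an illustration, not the real tee): copy in_fd to
   out_fd and to each of the nfiles files named in names[].
   Returns 0 on success and 1 if any open, read, or write failed. */
int tee_copy(int in_fd, int out_fd, int nfiles, char *const names[])
{
    assert(nfiles >= 0);
    int status = 0;
    int nfds = 0;
    int *fds = malloc(((size_t)nfiles + 1) * sizeof *fds);
    if (fds == NULL)
        return 1;

    fds[nfds++] = out_fd;           /* slot 0 is the caller's output */
    for (int i = 0; i < nfiles; i++)
    {
        int fd = open(names[i], O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0)
        {
            perror(names[i]);       /* report it, keep copying to the rest */
            status = 1;
            continue;
        }
        fds[nfds++] = fd;
    }

    char buf[4096];
    ssize_t n;
    while ((n = read(in_fd, buf, sizeof buf)) > 0)
    {
        for (int i = 0; i < nfds; i++)
        {
            if (write_all(fds[i], buf, (size_t)n) != 0)
                status = 1;
        }
    }
    if (n < 0)
        status = 1;

    for (int i = 1; i < nfds; i++)  /* out_fd belongs to the caller */
        close(fds[i]);
    free(fds);
    return status;
}
```

A main() for this sketch would only need to call tee_copy(STDIN_FILENO, STDOUT_FILENO, argc - 1, argv + 1). Note that nothing in it ever sees the names existingInputFile or newOutputFile2; it only uses the descriptors it inherited.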


I was under the impression that the shell was involved in the actual reading and writing of the files. So when I call execvp, it only takes in the command (in this case tee) and the final file to write the contents to (in this case newOutputFile2). I am trying to create my own shell program; how would I do the I/O redirection? Is this where dup2 comes into play?

The shell is only involved in opening and closing, but not in the reading and writing, of the files. In your command line tee newOutputFile < existingInputFile > newOutputFile2, the command is tee and the only other argument is newOutputFile. In general, the command (tee in this case) has no knowledge of the name of the file that is providing it with standard input, nor of the name of the file that it is writing to on its standard output. Indeed, especially with tee, the input is most often a pipe rather than a file, and very often the output is also a pipe rather than a file:

some_command arg1 arg2 | tee some_command.log | another_command its_arg1 its_arg2 > output.file
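For a shell that also wants to handle a pipeline of this shape, the same idea applies with pipe() supplying the descriptors. The following is a rough sketch for the two-command case "left | right > out_path" only; run_pipe_pair is a hypothetical name, and the code assumes descriptors 0, 1 and 2 are already open in the shell.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "left | right > out_path".
   Returns the exit status of the right-hand command, or -1 on failure.
   Assumes file descriptors 0, 1 and 2 are open in the caller. */
int run_pipe_pair(char *const left[], char *const right[], const char *out_path)
{
    assert(left[0] != NULL && right[0] != NULL);
    int p[2];
    if (pipe(p) < 0)
        return -1;

    pid_t lpid = fork();
    if (lpid < 0)
    {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (lpid == 0)
    {
        /* Left child: standard output becomes the write end of the pipe. */
        dup2(p[1], STDOUT_FILENO);
        close(p[0]);
        close(p[1]);
        execvp(left[0], left);
        perror(left[0]);
        _exit(127);
    }

    pid_t rpid = fork();
    if (rpid < 0)
    {
        close(p[0]);
        close(p[1]);
        waitpid(lpid, NULL, 0);
        return -1;
    }
    if (rpid == 0)
    {
        /* Right child: standard input becomes the read end of the pipe. */
        dup2(p[0], STDIN_FILENO);
        close(p[0]);
        close(p[1]);
        int fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0)
        {
            perror(out_path);
            _exit(1);
        }
        if (fd != STDOUT_FILENO)
        {
            dup2(fd, STDOUT_FILENO);
            close(fd);
        }
        execvp(right[0], right);
        perror(right[0]);
        _exit(127);
    }

    /* Parent: must close both ends, or the right child never sees EOF. */
    close(p[0]);
    close(p[1]);
    int status;
    waitpid(lpid, NULL, 0);
    if (waitpid(rpid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

As with file redirection, neither command is told that a pipe is involved; each simply finds the pipe on its standard input or output.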

In your own shell program, you could use dup2() to duplicate the file descriptor you'd opened separately so that it becomes standard input:

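// Redirect standard input from existingInputFile using dup2()
// Needs <errno.h>, <fcntl.h>, <stdio.h>, <stdlib.h>, <string.h>, <unistd.h>
const char *i_filename = "existingInputFile";
int fd = open(i_filename, O_RDONLY);
if (fd < 0)
{
    fprintf(stderr, "unable to open file %s for reading (%d: %s)\n",
            i_filename, errno, strerror(errno));
    exit(1);
}
if (fd != STDIN_FILENO)  // open() returns 0 directly if stdin was already closed
{
    if (dup2(fd, STDIN_FILENO) < 0)
    {
        fprintf(stderr, "unable to redirect standard input (%d: %s)\n",
                errno, strerror(errno));
        exit(1);
    }
    close(fd);  // Crucial!
}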

Note that it is important to close fd in this scenario. Otherwise, the command is run with at least one extra file descriptor open that was not specified in the command line. You'd have a similar block of code for the standard output redirection.
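Putting the pieces together, a shell would typically fork, apply both redirections in the child, and then exec the command. The sketch below is one possible arrangement rather than a definitive one; redirect_fd and run_redirected are hypothetical names, and the only arguments passed on to the command are the ones in argv.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: open path and make it appear on target_fd.
   Returns 0 on success, -1 on failure. */
static int redirect_fd(const char *path, int flags, mode_t mode, int target_fd)
{
    int fd = open(path, flags, mode);
    if (fd < 0)
        return -1;
    if (fd != target_fd)
    {
        int rc = dup2(fd, target_fd);
        close(fd);              /* do not leak the extra descriptor */
        if (rc < 0)
            return -1;
    }
    return 0;
}

/* Hypothetical helper: run argv[0] with optional "< in_path" and
   "> out_path" (pass NULL to leave a stream alone).
   Returns the child's exit status, or -1 on failure. */
int run_redirected(char *const argv[], const char *in_path, const char *out_path)
{
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        /* Child: set up the redirections, then become the command. */
        if (in_path != NULL &&
            redirect_fd(in_path, O_RDONLY, 0, STDIN_FILENO) != 0)
        {
            perror(in_path);
            _exit(1);
        }
        if (out_path != NULL &&
            redirect_fd(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0666,
                        STDOUT_FILENO) != 0)
        {
            perror(out_path);
            _exit(1);
        }
        execvp(argv[0], argv);
        perror(argv[0]);        /* only reached if execvp() failed */
        _exit(127);
    }

    /* Parent: does no I/O on the files, it only waits. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the command line in the question, the call would be run_redirected(args, "existingInputFile", "newOutputFile2") with args holding "tee", "newOutputFile" and a terminating NULL.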

Or you could use:

// Redirect standard input from existingInputFile
close(0);
char *i_filename = "existingInputFile";
int fd = open(i_filename, O_RDONLY);
if (fd < 0)
{
    fprintf(stderr, "unable to open file %s for reading (%d: %s)\n",
            i_filename, errno, strerror(errno));
    exit(1);
}
assert(fd == 0);

// Redirect standard output to newOutputFile2
close(1);
char *o_filename = "newOutputFile2";
fd = open(o_filename, O_WRONLY|O_CREAT|O_TRUNC, 0644); // Classically 0666
if (fd < 0)
{
    fprintf(stderr, "unable to open file %s for writing (%d: %s)\n",
            o_filename, errno, strerror(errno));
    exit(1);
}
assert(fd == 1);

This works because open() returns the lowest-numbered file descriptor that is not currently open, so after closing 0 you know that open() will return 0 on success and -1 on failure (even if 0 was already closed beforehand). Then, by induction, you know that after closing 1, open() will return 1 on success and -1 on failure (even if 1 was already closed beforehand). You don't normally tinker with standard error unless the command line includes I/O redirection such as 2>/dev/null or 2>&1 or something similar.
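If the command line does include a standard error redirection, it can be handled in the same style. Here is a small sketch; redirect_stderr is a hypothetical helper, and using a NULL path to stand for 2>&1 is just a convention chosen for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: path == NULL handles "2>&1"; otherwise "2>path".
   Returns 0 on success, -1 on failure. */
int redirect_stderr(const char *path)
{
    assert(path == NULL || path[0] != '\0');  /* an empty name is a caller bug */

    if (path == NULL)
    {
        /* 2>&1: make descriptor 2 a copy of whatever 1 refers to right now. */
        return dup2(STDOUT_FILENO, STDERR_FILENO) < 0 ? -1 : 0;
    }

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -1;
    if (fd != STDERR_FILENO)
    {
        int rc = dup2(fd, STDERR_FILENO);
        close(fd);
        if (rc < 0)
            return -1;
    }
    return 0;
}
```

Because 2>&1 copies the current target of standard output, a shell has to apply redirections in the order they appear: for "> file 2>&1", redirect standard output first and call redirect_stderr(NULL) afterwards.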

If you prefer, you can write 0644 as:

S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH

(and add |S_IWGRP|S_IWOTH if you want group and other write permission (0666); the permissions will be modified by the umask anyway). These names are defined in <sys/stat.h>. Personally, I find the octal a lot easier to read, but I started using octal permissions a number of years before the S_Ixxxx names were invented.
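To see the effect of the umask on the mode passed to open(), here is a small sketch that creates a file under a chosen umask and reports the permission bits it actually received. The helper name created_mode is hypothetical, and the result assumes an ordinary file system without default ACLs.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create path with the requested mode while the given
   umask is in effect, and return the permission bits the new file received,
   or (mode_t)-1 on failure. The previous umask is restored before returning. */
mode_t created_mode(const char *path, mode_t requested, mode_t mask)
{
    assert((mask & ~(mode_t)0777) == 0);    /* umask holds permission bits only */

    mode_t old_mask = umask(mask);
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, requested);
    umask(old_mask);
    if (fd < 0)
        return (mode_t)-1;

    struct stat st;
    int rc = fstat(fd, &st);
    close(fd);
    if (rc != 0)
        return (mode_t)-1;
    return st.st_mode & 0777;               /* requested & ~mask */
}
```

With a umask of 022, a request for 0666 yields a file with mode 0644, which is why passing 0666 and letting the user's umask decide is the classic choice.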
