tee newOutputFile < existingInputFile > newOutputFile2
How exactly will tee
take in the arguments? Would it be like this?
- Tee will first process
newOutputFile < existingInputFile
So the contents of existingInputFile will be written into newOutputFile
newOutputFile > newOutputFile2
So the contents of newOutputFile will be written into newOutputFile2
I am trying to write a shell that processes this particular command. However, I am confused about the order in which to pass the arguments to tee
. The way I have coded my program, it will run
tee newOutputFile2 < existingInputFile
The tee
command is a regular Unix program, just like sh
or sort
or cat
.
All the I/O redirection work involved in handling < existingInputFile
and > newOutputFile2
is done by the shell before the tee
command is invoked (after the fork
that creates the process that will execute the tee
command). The command is invoked with its standard input coming from existingInputFile
and with its standard output going to newOutputFile2
. The only arguments given to tee
are argv[0]
(the string tee
) and argv[1]
(the string newOutputFile
), plus a null pointer to mark the end of the argument list.
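To make the split between arguments and redirections concrete, here is a minimal sketch of how a toy shell might separate them. The `struct command` type, the `parse_command()` name and the `MAX_ARGS` limit are hypothetical, and the sketch assumes that every token (including `<` and `>`) is separated by whitespace, with no quoting support.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 64   /* arbitrary limit for this sketch */

/* Hypothetical container for one parsed command (not from the original post). */
struct command {
    char *argv[MAX_ARGS + 1]; /* what execvp() will receive, NULL-terminated */
    char *infile;             /* file named after "<", or NULL */
    char *outfile;            /* file named after ">", or NULL */
};

/* Split a line in place. Returns 0 on success, -1 on a malformed line. */
int parse_command(char *line, struct command *cmd)
{
    const char *delims = " \t\n";
    int argc = 0;

    assert(line != NULL && cmd != NULL);
    cmd->infile = NULL;
    cmd->outfile = NULL;

    for (char *tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims)) {
        if (strcmp(tok, "<") == 0 || strcmp(tok, ">") == 0) {
            /* The next token is a file name for the shell, not an argument. */
            char *name = strtok(NULL, delims);
            if (name == NULL)
                return -1;            /* operator without a file name */
            if (tok[0] == '<')
                cmd->infile = name;
            else
                cmd->outfile = name;
        } else {
            if (argc == MAX_ARGS)
                return -1;            /* too many arguments for this sketch */
            cmd->argv[argc++] = tok;
        }
    }
    cmd->argv[argc] = NULL;           /* the null pointer that ends the list */
    return argc > 0 ? 0 : -1;
}
```

For the command line in the question, this leaves `argv` holding `tee`, `newOutputFile` and the terminating null pointer, while the two file names after `<` and `>` are kept aside for the shell to open.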
Note specifically that the shell is not involved in the actual reading of existingInputFile
; it just opens it for reading and connects it to the standard input of tee
, but has no knowledge of whether the tee
command actually reads it or not. Similarly, the shell is not involved in the actual writing to newOutputFile2
; it just opens and truncates it (or creates it) and connects it to the standard output of tee
, but has no knowledge of whether the tee
command actually writes anything to it. In this context, while the tee
command is running, the parent shell is completely passive, doing no I/O.
By design, tee
reads its standard input and writes one copy of everything to each of the files given in its argument list and one more copy to standard output.
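As an illustration of that behaviour, the following is a rough sketch of the core loop of a tee-like program. It is not the real implementation: `mini_tee()`, `write_all()` and `MAX_OUTPUTS` are names made up for this example, options such as `-a` are ignored, and the input and output descriptors are passed as parameters (a real tee would simply use descriptors 0 and 1).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_OUTPUTS 16   /* arbitrary limit for this sketch */

/* Write all n bytes, retrying after short writes. Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Copy in_fd to out_fd and to every named file. Returns 0 or -1. */
int mini_tee(int in_fd, int out_fd, int nfiles, char *const names[])
{
    int fds[MAX_OUTPUTS + 1];
    int nfds = 0;
    int status = 0;
    char buf[4096];
    ssize_t r;

    assert(nfiles >= 0 && nfiles <= MAX_OUTPUTS);
    fds[nfds++] = out_fd;   /* the "one more copy" goes to standard output */
    for (int i = 0; i < nfiles; i++) {
        int fd = open(names[i], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            fprintf(stderr, "mini_tee: %s: %s\n", names[i], strerror(errno));
            status = -1;    /* report the failure but keep going */
            continue;
        }
        fds[nfds++] = fd;
    }
    while ((r = read(in_fd, buf, sizeof buf)) > 0) {
        for (int i = 0; i < nfds; i++)
            if (write_all(fds[i], buf, (size_t)r) < 0)
                status = -1;
    }
    if (r < 0)
        status = -1;
    for (int i = 1; i < nfds; i++)
        close(fds[i]);      /* fds[0] (out_fd) belongs to the caller */
    return status;
}
```

Note that nothing in this function knows or cares where `in_fd` and `out_fd` come from; whether they are files or pipes was decided before it was called.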
I was under the impression that the shell was involved in the actual reading and writing of the files. So when I call execvp
, it only takes in the command (in this case tee
) and the final file to write the contents to (in this case newOutputFile2
). I am trying to create my own shell program; how would I do the I/O redirection? Is this where dup2
comes into play?
The shell is only involved in opening and closing, but not in the reading and writing, of the files. In your command line tee newOutputFile < existingInputFile > newOutputFile2
, the command is tee
and the only other argument is newOutputFile
. In general, the command (tee
in this case) has no knowledge of the name of the file that is providing it with standard input, nor of the name of the file that it is writing to on its standard output. Indeed, especially with tee
, the input is most often a pipe rather than a file, and very often the output is also a pipe rather than a file:
some_command arg1 arg2 | tee some_command.log | another_command its_arg1 its_arg2 > output.file
In your own shell program, you could use dup2()
to duplicate the file descriptor you'd opened separately so that it becomes standard input:
// Redirect standard input from existingInputFile using dup2()
char *i_filename = "existingInputFile";
int fd = open(i_filename, O_RDONLY);
if (fd < 0)
{
    fprintf(stderr, "unable to open file %s for reading (%d: %s)\n",
            i_filename, errno, strerror(errno));
    exit(1);
}
dup2(fd, STDIN_FILENO);
close(fd); // Crucial!
Note that it is important to close fd
in this scenario. Otherwise, the command is run with at least one extra file descriptor open that was not specified in the command line. You'd have a similar block of code for the standard output redirection.
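Putting the pieces together, here is one possible sketch of how a shell might run a command with both redirections: fork, redirect inside the child with dup2(), then execvp(). The helper names `redirect_or_die()` and `run_redirected()` are invented for this example, and the sketch uses `_exit()` in the child so that the parent's stdio buffers are not flushed twice.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child only: open path and make it become target_fd, or exit. */
static void redirect_or_die(const char *path, int flags, int target_fd)
{
    int fd = open(path, flags, 0644);
    if (fd < 0) {
        fprintf(stderr, "unable to open %s (%d: %s)\n", path, errno, strerror(errno));
        _exit(1);
    }
    if (fd != target_fd) {
        if (dup2(fd, target_fd) < 0) {
            fprintf(stderr, "dup2 failed (%d: %s)\n", errno, strerror(errno));
            _exit(1);
        }
        close(fd);   /* do not leak the extra descriptor into the command */
    }
}

/* Run argv[0] with optional redirections; returns its exit status or -1. */
int run_redirected(char *const argv[], const char *infile, const char *outfile)
{
    int status;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: all redirection happens here, before the exec. */
        if (infile != NULL)
            redirect_or_die(infile, O_RDONLY, STDIN_FILENO);
        if (outfile != NULL)
            redirect_or_die(outfile, O_WRONLY | O_CREAT | O_TRUNC, STDOUT_FILENO);
        execvp(argv[0], argv);
        fprintf(stderr, "unable to execute %s (%d: %s)\n", argv[0], errno, strerror(errno));
        _exit(127);
    }
    /* Parent: does no I/O on the files; it only waits for the child. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the command in the question, the call would pass an argument vector of `tee`, `newOutputFile` and a null pointer, with `existingInputFile` and `newOutputFile2` as the two file names. The file names after `<` and `>` never appear in the argument vector.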
Or you could use:
// Redirect standard input from existingInputFile
close(0);
char *i_filename = "existingInputFile";
int fd = open(i_filename, O_RDONLY);
if (fd < 0)
{
    fprintf(stderr, "unable to open file %s for reading (%d: %s)\n",
            i_filename, errno, strerror(errno));
    exit(1);
}
assert(fd == 0);
// Redirect standard output to newOutputFile2
close(1);
char *o_filename = "newOutputFile2";
fd = open(o_filename, O_WRONLY|O_CREAT|O_TRUNC, 0644); // Classically 0666
if (fd < 0)
{
    fprintf(stderr, "unable to open file %s for writing (%d: %s)\n",
            o_filename, errno, strerror(errno));
    exit(1);
}
assert(fd == 1);
This is because open()
returns the lowest-numbered file descriptor that is not currently open, so by closing 0, you know that open()
will return 0 on success and -1 on failure (even if 0 was already closed beforehand). Then, by induction, you know that after closing 1, open()
will return 1 on success and -1 on failure (even if 1 was already closed beforehand). You don't normally tinker with standard error unless the command line includes I/O redirection such as 2>/dev/null
or 2>&1
or something similar.
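The lowest-descriptor rule can be observed without touching descriptors 0 and 1. The small sketch below (the function name `reopen_gets_same_fd()` is made up for this example, and it assumes a single-threaded process) opens a file twice, closes the first descriptor, and checks that the next open() reuses that number.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if open() reuses a just-closed descriptor, 0 if not, -1 on error. */
int reopen_gets_same_fd(const char *path)
{
    int first, second, again, same;

    assert(path != NULL);
    first = open(path, O_RDONLY);    /* lowest free descriptor right now */
    if (first < 0)
        return -1;
    second = open(path, O_RDONLY);   /* occupies the next free slot */
    if (second < 0) {
        close(first);
        return -1;
    }
    close(first);                    /* 'first' is now the lowest free number */
    again = open(path, O_RDONLY);
    if (again < 0) {
        close(second);
        return -1;
    }
    same = (again == first);
    close(again);
    close(second);
    return same;
}
```

The same reasoning is what makes the close(0) followed by open() idiom work for standard input.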
If you prefer, you can write 0644 using the symbolic constants from <sys/stat.h>:
S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH
(and add |S_IWGRP|S_IWOTH
if you want group and other write permission as well (0666); the permissions will be modified by the umask
anyway). Personally, I find the octal a lot easier to read, but I started using octal permissions a number of years before the S_Ixyyy
names were invented.
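As a quick check of the permission arithmetic, here is a small sketch using the POSIX names from `<sys/stat.h>`, which are spelled with an `S_` prefix. The helper names `mode_0644()` and `effective_mode()` are hypothetical. The second helper only models the usual rule that the requested mode is ANDed with the complement of the umask; it assumes no default ACLs are involved and does not call umask() itself.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* The symbolic spelling of 0644: rw for the owner, r for group and others. */
mode_t mode_0644(void)
{
    return S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;
}

/* Model of how the umask trims the mode requested in open(). */
mode_t effective_mode(mode_t requested, mode_t mask)
{
    assert((mask & ~(mode_t)0777) == 0);   /* only permission bits expected */
    return requested & ~mask;
}
```

With the common umask of 022, a request for 0666 ends up as 0644, which is why the classic choice of 0666 in open() is usually harmless.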