Executing a command-line pipeline in C with fork()

2019-03-06 03:49发布

问题:

I asked about my code and the answer was that it is incorrect. perror usage in this case Now I wonder how I can adjust and improve so that I no longer will have these errors?

execvp does not return, except if an error occurs, so if all works, the enclosing function will never return.

the value 'i' is already past the end of the array in 'cmd' due to the prior loop, so 'cmd[i].argv[0] is not correct.

cmd is not an array, of struct command, so should not be indexed

the first entry in cmd.argv is a pointer to an array where the last entry is NULL. the execvp will work on that (and only that) array so all other pointers to arrays will be ignored

There are a large number of errors in the code. for instance, the first time through the loop in fork_pipe() 'in' contains garbage. the second parameter passed to execvp() needs to be a pointer to character strings, with a final NULL pointer. That final NULL pointer is missing, There are plenty more problems

#include <sys/types.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
struct command
{
    const char **argv;
};
/* Helper function that spawns processes */
int spawn_proc (int in, int out, struct command *cmd) {
    pid_t pid;
    if ((pid = fork ()) == 0) {
        if (in != 0) {
            /*if (dup2(in, 0) == -1) {
                perror("dup2 failed");
                exit(1);
            }*/
            dup2 (in, 0);
            close (in);
        }
        if (out != 1) {
            dup2 (out, 1);
            close (out);
        }
        if (execvp(cmd->argv [0], (char * const *)cmd->argv) < 0) {
            perror("execvp failed");
            exit(1);
        }
    } else if (pid < 0) {
        perror("fork failed");
        exit(1);
    }
    return pid;
}
/* Helper function that forks pipes */
int fork_pipes (int n, struct command *cmd) {
    int i;
    int in, fd [2];
    for (i = 0; i < n - 1; ++i) {
        pipe (fd);
        spawn_proc (in, fd [1], cmd + i);
        close (fd [1]);
        in = fd [0];
    }
    dup2 (in, 0);
    /*return execvp (cmd [i].argv [0], (char * const *)cmd [i].argv);*/
    if (execvp (cmd [i].argv [0], (char * const *)cmd [i].argv) < 0) {
        perror("execvp failed");
        exit(1);
    } else {
        return execvp (cmd [i].argv [0], (char * const *)cmd [i].argv);
    }
}

int main (int argc, char ** argv) {
    int i;
    if (argc == 1) { /* There were no arguments */
        const char *printenv[] = { "printenv", 0};
        const char *sort[] = { "sort", 0 };
        const char *less[] = { "less", 0 };
        struct command cmd [] = { {printenv}, {sort}, {less} };
        return fork_pipes (3, cmd);
    }
    if (argc > 1) { /* I'd like an argument */

        if (strncmp(argv[1], "cd", 2) && strncmp(argv[1], "exit", 2)) {
            char *tmp;
            int len = 1;
            for( i=1; i<argc; i++)
            {
                len += strlen(argv[i]) + 2;
            }
            tmp = (char*) malloc(len);
            tmp[0] = '\0';
            int pos = 0;
            for( i=1; i<argc; i++)
            {
                pos += sprintf(tmp+pos, "%s%s", (i==1?"":"|"), argv[i]);
            }
            const char *printenv[] = { "printenv", 0};
            const char *grep[] = { "grep", "-E", tmp, NULL};
            const char *sort[] = { "sort", 0 };
            const char *less[] = { "less", 0 };
            struct command cmd [] = { {printenv}, {grep}, {sort}, {less} };
            return fork_pipes (4, cmd);
            free(tmp);
        } else if (! strncmp(argv[1], "cd", 2)) { /* change directory */
            printf("change directory to %s\n" , argv[2]);
            chdir(argv[2]);
        } else if (! strncmp(argv[1], "exit", 2)) { /* change directory */
            printf("exit\n");
            exit(0);
        }
    }
    exit(0);
}

回答1:

Transferring comments into (part of) an answer.

  • You don't need the test on execvp() (if it returns, it failed), but you do need the error reporting and exit call after it.

  • The 'cmd is not an array' comment seems to be bogus; inside fork_pipes(), it is an array. It is not used as an array inside spawn_proc().

  • I think the comment 'The first entry in cmd.argv is a pointer to an array where the last entry is NULL. The execvp will work on that (and only that) array so all other pointers to arrays will be ignored' is bogus too. I think they overlooked that you're creating an array of struct command's.

  • I think the comment 'the value i is already past the end of the array in cmd due to the prior loop, so cmd[i].argv[0] is not correct' is incorrect because the loop is for (i = 0; i < n - 1; i++) so i is n-1 after the loop, and the array cmd has elements 0..n-1 to address.

  • However, the value of in in the first call to spawn_proc() is indeed garbage. Probably you can simply set it to 0 (STDIN_FILENO) and be OK, but you need to verify that. But the comment about the second argument to execvp() is peculiar — the cast should be absent, but otherwise the code looks OK to me. I should add that I've not yet run a compiler over any of this, so anything I've said so far stands to be corrected by a compiler. But I'm not doing the analysis casually either… Are you compiling with your compiler set fussy: gcc -Wall -Wextra -Werror as a minimum (I use more options!)?

  • In fork_pipes(), the if test on execvp() and the else clause are weird. You just need calls to execvp(), perror() and exit().


These comments above stand basically accurate. Here is some modified code, but the modifications are mostly cosmetic. The err_syserr() function is based on what I use for reporting errors. It is a varargs function that also reports the system error. It is better than perror() because (a) it can format more comprehensively and (b) it exits.

I was getting compilation warnings like:

ft13.c: In function ‘spawn_proc’:
ft13.c:45:9: error: passing argument 2 of ‘execvp’ from incompatible pointer type [-Werror]
         execvp(cmd->argv[0], cmd->argv);
         ^
In file included from ft13.c:6:0:
/usr/include/unistd.h:440:6: note: expected ‘char * const*’ but argument is of type ‘const char **’
 int  execvp(const char *, char * const *);

The easiest way to fix these is to place the const in the correct place in struct command and to remove the const from the argument lists for the various commands.

Other changes are more cosmetic than really substantive (the uninitialized in was the only serious bug to fix). I've use my error reporting code, and checked some extra system calls (the dup2() ones, for example), and cleaned up the execvp() and error reporting. I moved the tests for exit and cd ahead of the general code to avoid repeating the tests. Also, you were using strncmp() and the test for exit was only looking at ex, but ex is a system command… Use strcmp(). I use strcmp(x, y) == 0 in the condition; the relational operator in use mimics the relational operation I'm testing (so strcmp(x, y) >= 0 tests for x greater than or equal to y, etc.).

Modern POSIX does not require #include <sys/types.h> as an include. The other headers include it as necessary.

Source: ft13.c

Compiled to ft13.

#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct command
{
    char * const *argv;
};

static _Noreturn void err_syserr(char *fmt, ...)
{
    int errnum = errno;
    va_list args;
    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
    if (errnum != 0)
        fprintf(stderr, "(%d: %s)\n", errnum, strerror(errnum));
    exit(EXIT_FAILURE);
}

/* Helper function that spawns processes */
static int spawn_proc(int in, int out, struct command *cmd)
{
    pid_t pid;
    if ((pid = fork()) == 0)
    {
        if (in != 0)
        {
            if (dup2(in, 0) < 0)
                err_syserr("dup2() failed on stdin for %s: ", cmd->argv[0]);
            close(in);
        }
        if (out != 1)
        {
            if (dup2(out, 1) < 0)
                err_syserr("dup2() failed on stdout for %s: ", cmd->argv[0]);
            close(out);
        }
        fprintf(stderr, "%d: executing %s\n", (int)getpid(), cmd->argv[0]);
        execvp(cmd->argv[0], cmd->argv);
        err_syserr("failed to execute %s: ", cmd->argv[0]);
    }
    else if (pid < 0)
        err_syserr("fork failed: ");
    return pid;
}

/* Helper function that forks pipes */
static void fork_pipes(int n, struct command *cmd)
{
    int i;
    int in = 0;
    int fd[2];
    for (i = 0; i < n - 1; ++i)
    {
        pipe(fd);
        spawn_proc(in, fd[1], cmd + i);
        close(fd[1]);
        in = fd[0];
    }
    if (dup2(in, 0) < 0)
        err_syserr("dup2() failed on stdin for %s: ", cmd[i].argv[0]);
    fprintf(stderr, "%d: executing %s\n", (int)getpid(), cmd[i].argv[0]);
    execvp(cmd[i].argv[0], cmd[i].argv);
    err_syserr("failed to execute %s: ", cmd[i].argv[0]);
}

int main(int argc, char **argv)
{
    int i;
    if (argc == 1)   /* There were no arguments */
    {
        char *printenv[] = { "printenv", 0};
        char *sort[] = { "sort", 0 };
        char *less[] = { "less", 0 };
        struct command cmd[] = { {printenv}, {sort}, {less} };
        fork_pipes(3, cmd);
    }
    else
    {
        if (strcmp(argv[1], "cd") == 0)      /* change directory */
        {
            printf("change directory to %s\n", argv[2]);
            chdir(argv[2]);
        }
        else if (strcmp(argv[1], "exit") == 0)
        {
            printf("exit\n");
            exit(0);
        }
        else
        {
            char *tmp;
            int len = 1;
            for (i = 1; i < argc; i++)
            {
                len += strlen(argv[i]) + 2;
            }
            tmp = (char *) malloc(len);
            tmp[0] = '\0';
            int pos = 0;
            for (i = 1; i < argc; i++)
            {
                pos += sprintf(tmp + pos, "%s%s", (i == 1 ? "" : "|"), argv[i]);
            }
            char *printenv[] = { "printenv", 0};
            char *grep[] = { "grep", "-E", tmp, NULL};
            char *sort[] = { "sort", 0 };
            char *less[] = { "less", 0 };
            struct command cmd[] = { {printenv}, {grep}, {sort}, {less} };
            fork_pipes(4, cmd);
            free(tmp);
        }
    }
    return(0);
}

Example runs

Sample 1:

$ ./ft13 | cat
1733: executing less
1735: executing printenv
1736: executing sort
Apple_PubSub_Socket_Render=/private/tmp/com.apple.launchd.sl7NmyZPgI/Render
BASH_ENV=/Users/jleffler/.bashrc
CDPATH=:/Users/jleffler:/Users/jleffler/src:/Users/jleffler/src/perl:/Users/jleffler/src/sqltools:/Users/jleffler/lib:/Users/jleffler/doc:/Users/jleffler/work:/Users/jleffler/ids
CLICOLOR=1
…lots of environment omitted…
VISUAL=vim
XPC_FLAGS=0x0
XPC_SERVICE_NAME=0
_=./ft13
__CF_USER_TEXT_ENCODING=0x1F7:0x0:0x0
$

Sample 2:

$ ./ft13 PATH | cat
1739: executing printenv
1737: executing less
1740: executing grep
1741: executing sort
CDPATH=:/Users/jleffler:/Users/jleffler/src:/Users/jleffler/src/perl:/Users/jleffler/src/sqltools:/Users/jleffler/lib:/Users/jleffler/doc:/Users/jleffler/work:/Users/jleffler/ids
DYLD_LIBRARY_PATH=/usr/lib:/usr/informix/11.70.FC6/lib:/usr/informix/11.70.FC6/lib/esql:/usr/informix/11.70.FC6/lib/cli
GOPATH=/Users/jleffler/Software/go-1.2
LD_LIBRARY_PATH=/usr/lib:/usr/gnu/lib:/usr/gcc/v4.9.1/lib
MANPATH=/Users/jleffler/man:/Users/jleffler/share/man:/usr/local/mysql/man:/usr/gcc/v4.9.1/share/man:/Users/jleffler/perl/v5.20.1/man:/usr/local/man:/usr/local/share/man:/opt/local/man:/opt/local/share/man:/usr/share/man:/usr/gnu/man:/usr/gnu/share/man
PATH=/Users/jleffler/bin:/usr/informix/11.70.FC6/bin:.:/usr/local/mysql/bin:/usr/gcc/v4.9.1/bin:/Users/jleffler/perl/v5.20.1/bin:/usr/local/go/bin:/Users/jleffler/Software/go-1.2/bin:/usr/local/bin:/opt/local/bin:/usr/bin:/bin:/usr/gnu/bin:/usr/sbin:/sbin
$