what is the proper way to pipe when making a shell

2019-02-14 07:02发布

问题:

I’m attempting to create my own shell I believe i have the forking done correctly but i cannot figure out how to pipe correctly. Any help or tips would be appreciated.

basically my pipes aren’t working and i have spent ages attempting to figure out how to get them to properly transmit the data between the processes.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
#include "ourhdr.h" // from Steven's book Advanced programing in the UNIX Enviroment


extern int makeargv(char *, char * , char ***);
int main()  
{   char **argp;
    int i,j,vpret;
    char buf[80];
    pid_t pid;
    int pnum;   
        //pipe number
    int ploc[16];   
        //pipe location 
        //(this has a max of 12 possible pipes)
    int targ;
    int fdleft[2], fdright[2];




    printf("            <(^_^)> \nHello \n I am your console and I am here to help you \n");
    printf("    If you dont need me anymore just say \"bye\" ");
    fflush(stdout);
    write(1,"\n(>^_^)> ",8);
    while(strcmp(fgets(buf, 80, stdin), "bye\n")!=0)
    {        
        j=makeargv(buf," \n",&argp); //this breaks the line up and returns the number of commands 
        pnum = 0;
        ploc[0] = 0;
        if (j > 16) j = 16;
        for (i=0;i<j;i++)
        {
            if ( strcmp(argp[i], "|") == 0)
            {
                argp[i]= NULL;
                ploc[pnum+1] = (i+1);
                pnum++;
            }

        } 

        for (i = 0; i < (pnum+1); i++) 
        {   
            pipe(fdright);
            if (i != 0)
            {   
                dup2(fdright[1], fdleft[0]);            
            }
            pid = fork();


            switch (pid)
            {
                case -1:
                    err_sys("fork failed");
                    break;
                case 0: // child
                    if (i != pnum)
                    {
                        dup2(fdright[1],1);
                    }
                    if ( i != 0);
                    {
                        dup2(fdright[0],0); 
                    }
                    //printf("(^o^) running pipe[%i]\n" , i);
                    targ =(ploc[i]) ;
                    execvp(argp[targ],&argp[targ]);
                    write(1,"(-_-) I'm sorry the exec failed \n",33);
                    exit(1);
                default:

                    dup2(fdleft[1], fdright[1]);
                    waitpid(pid,NULL, 0);
                    close(fdright[0]);
                    close(fdright[1]);
                    //waitpid(pid,NULL, 0);
            }
        }   
        //waitpid(pid, NULL, 0);
        write(1,"\n(>^_^)> ",8);
    }
    printf("  v(^o^)^ BYE BYE!\n");

}

thanks

回答1:

Various comments:

  1. while (strcmp(fgets(buf, 80, stdin), "bye\n")!=0)

    This crashes if the shell is given EOF instead of bye. Don't combine the two function calls like that. If you want it all in one loop condition, then use:

    while (fgets(buf, sizeof(buf), stdin) != 0 && strcmp(buf, "bye\n") != 0)
    

    We'll discuss limits on the length of the command line another time.

  2. Since you've not supplied makeargv() to look at, we have to assume it does its job OK.

  3. Your loop that splits things up into commands and pipes is:

    pnum = 0;
    ploc[0] = 0;
    if (j > 16) j = 16;
    for (i = 0; i < j; i++)
    {
        if (strcmp(argp[i], "|") == 0)
        {
            argp[i] = NULL;
            ploc[pnum+1] = (i+1);
            pnum++;
        }
    }
    

    Let's assume we have a command line input: ls -l | grep lemon. It appears that your makeargv() would return 5 and set argp as follows:

    argp[0] = "ls";
    argp[1] = "-l";
    argp[2] = "|";
    argp[3] = "grep";
    argp[4] = "lemon";
    argp[5] = 0;         // Inferred - things will crash sooner or later if wrong
    

    This loop of yours would give you:

    ploc[0] = 0;
    ploc[1] = 3;
    pnum = 1;
    
  4. Your code has a pair of file descriptors in the fdleft array, but you never initialize the array (call pipe(), for example), even though you use it in calls to dup2().

  5. The main for loop then has to run twice, once for each command. For the general case of three or more commands in a pipeline (who | grep me | sort, for example), your first command (who) needs its standard input untouched, but its standard output going to the pipe that connects who and grep. The 'middle' commands (second to penultimate commands, or grep me in the example) each needs its standard input to come from the previous pipe and needs to create a new pipe for its standard output to go to. The last command (in this example, the third command, sort), needs its standard input to come from the last pipe and its standard output unchanged.

    Your code doesn't do that, or get anywhere close.

  6. When you use pipe() and then dup() or dup2() to map any of the descriptors to standard I/O descriptors, you need to close both ends of the pipe. You simply don't have enough close() calls around either.

  7. Your parent process has to launch each of the children in turn, and only waits for them to exit after it has launched them all. There are different ways of organizing the processes. The parent could fork once; the child could be responsible for launching the leading commands in the pipeline, finally executing the last command itself. The parent has just one direct child (the others are grandchildren), so it only has to wait for the one command to finish. The alternative is that the parent knows about each of the processes in the pipeline and it waits for them all to finish.

  8. If your parent processes waits for each command in a pipeline to complete before launching the remainder, you can end up with a form of deadlock. One child writes so much data to its pipe that it is blocked by the kernel until some process reads from the pipe, but the process to read from the pipe is not yet launched and the parent is waiting for the child to exit. You also lose the benefits of multi-processing and concurrency.

  9. In your 'case 0', you have an extraneous semi-colon. My compiler warned me about it. If yours didn't warn you about it, you need to use more compilation warnings or get a better compiler.

    case 0: // child
        if (i != pnum)
        {
            dup2(fdright[1], 1);
        }
        if (i != 0);     // Unwanted semi-colon!
        {
            dup2(fdright[0], 0); 
        }
    
  10. There are many questions about pipe lines in mini-shells on SO, including:

    • SO 13636252
    • SO 13693446
    • SO 13905948
    • SO 14312939
    • SO 13213864

    The answer for 13636252 is most nearly generic. The only snag is it uses char *** which is apt to be confusing, and it is so tightly written it has mutually recursive functions with minimal repetition. OTOH, it works correctly, and your makeargv() function also uses a char *** argument.


Reworked code

Here is your code reworked so that it works. It includes implementations of err_sys() and makeargv(). My makeargv() simply assumes that there will be fewer than 32 words in the command line. It is not, by any stretch of my imagination, a robust command line parser. It does allow you to type ls | wc and gives the correct answer; it also allows who | grep me | sort and gives the correct answer; it also allows ls and gives the correct answer. The spaces around the pipe symbols are crucial, though (in a normal shell, they're optional, so who|grep me|sort should work too, but it won't with this code.

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

extern int  makeargv(char *line, char *seps, char ***args);
extern void err_sys(const char *msg);
static void dump_argv(char **argv);
static void dump_fds(void);

int main(void)
{
    char buf[80];

    printf("            <(^_^)> \nHello \n I am your console and I am here to help you\n");
    printf("    If you don't need me anymore just say \"bye\"\n");
    fflush(stdout);
    dump_fds();
    printf("(>^_^)> ");
    while (fgets(buf, sizeof(buf), stdin) != 0 && strcmp(buf, "bye\n") != 0)
    {
        pid_t pid;
        char **argp;
        int fdleft[2]  = { -1, -1 };
        int fdright[2] = { -1, -1 };
        int pnum = 0;
        int ploc[16];
        int j = makeargv(buf, " \n", &argp);

        ploc[0] = 0;
        if (j > 16)
            j = 16;
        for (int i = 0; i < j; i++)
        {
            if (strcmp(argp[i], "|") == 0)
            {
                argp[i] = NULL;
                ploc[++pnum] = i+1;
            }
        }

        printf("pnum = %d\n", pnum);
        for (int k = 0; k < pnum+1; k++)
            printf("ploc[%d] = %d\n", k, ploc[k]);

        for (int i = 0; i < pnum+1; i++)
        {
            if (i != pnum)
            {
                if (pnum > 0)
                {
                    if (pipe(fdright) != 0)
                        err_sys("pipe");
                    //printf("%d: fdright = { %d, %d }\n", i, fdright[0], fdright[1]);
                    //dump_fds();
                }
            }

            if ((pid = fork()) < 0)
                err_sys("fork failed");
            else if (pid == 0)
            {
                /* Child */
                int targ;
                //dump_fds();
                if (i != pnum)
                {
                    dup2(fdright[1], 1);
                    close(fdright[0]);
                    close(fdright[1]);
                }
                if (i != 0)
                {
                    dup2(fdleft[0], 0);
                    close(fdleft[0]);
                    close(fdleft[1]);
                }
                targ = ploc[i];
                dump_argv(&argp[targ]);
                dump_fds();
                execvp(argp[targ], &argp[targ]);
                fprintf(stderr, "(-_-) I'm sorry the exec failed\n");
                exit(1);
            }

            if (i != 0)
            {
                //dump_fds();
                //printf("%d: fdleft = { %d, %d }\n", i, fdleft[0], fdleft[1]);
                assert(fdleft[0] != -1 && fdleft[1] != -1);
                close(fdleft[0]);
                close(fdleft[1]);
                //dump_fds();
            }

            printf("PID %d launched\n", pid);
            fdleft[0] = fdright[0];
            fdleft[1] = fdright[1];
        }

        //dump_fds();
        //printf("%d: fdleft = { %d, %d }\n", -1, fdleft[0], fdleft[1]);
        close(fdleft[0]);
        close(fdleft[1]);
        free(argp);
        //dump_fds();

        int corpse;
        int status;
        while ((corpse = waitpid(0, &status, 0)) > 0)
            printf(":-( PID %d status 0x%.4X\n", corpse, status);

        printf("\n(>^_^)> ");
    }
    printf("  v(^o^)^ BYE BYE!\n");
}

static void dump_argv(char **argv)
{
    int n = 0;
    char **args;
    args = argv;
    while (*args++ != 0)
        n++;
    fprintf(stderr, "%d: %d args\n", getpid(), n);
    args = argv;
    while (*args != 0)
        fprintf(stderr, "[%s]\n", *args++);
    fprintf(stderr, "EOA\n");
}

/* Report on open file descriptors (0..19) in process */
static void dump_fds(void)
{
    struct stat b;
    char buffer[32];
    sprintf(buffer, "%d: ", getpid());
    char *str = buffer + strlen(buffer);
    for (int i = 0; i < 20; i++)
        *str++ = (fstat(i, &b) == 0) ? 'o' : '-';
    *str++ = '\n';
    *str = '\0';
    fputs(buffer, stderr);
}

int makeargv(char *line, char *seps, char ***args)
{
    enum { MAX_ARGS = 32 };
    char **argv = malloc(32 * sizeof(char *));  // Lazy!
    if (argv == 0)
        err_sys("out of memory in makeargv()");
    int n;
    char **argp = argv;
    char  *str = line;
    for (n = 0; n < MAX_ARGS - 1; n++)
    {
        str += strspn(str, seps);
        if (*str == '\0')
            break;
        *argp++ = str;
        int len = strcspn(str, seps);
        if (len == 0)
            break;
        str[len] = '\0';
        str += len + 1;
    }
    *argp = 0;
    dump_argv(argv);
    *args = argv;
    return(n);
}

void err_sys(const char *msg)
{
    int errnum = errno;
    char *errmsg = strerror(errnum);
    fprintf(stderr, "%s (%d: %s)\n", msg, errnum, errmsg);
    exit(1);
}

Sample output:

$ ./pipes-15673333
            <(^_^)> 
Hello 
 I am your console and I am here to help you
    If you don't need me anymore just say "bye"
29191: ooo-----------------
(>^_^)> who | grep jl | sort
29191: 6 args
[who]
[|]
[grep]
[jl]
[|]
[sort]
EOA
pnum = 2
ploc[0] = 0
ploc[1] = 2
ploc[2] = 5
PID 29194 launched
PID 29195 launched
29194: 1 args
[who]
EOA
PID 29196 launched
29194: ooo-----------------
29195: 2 args
[grep]
[jl]
EOA
29195: ooo-----------------
29196: 1 args
[sort]
EOA
29196: ooo-----------------
:-( PID 29194 status 0x0000
jleffler console  Mar 27 15:11 
jleffler ttys000  Mar 27 16:26 
jleffler ttys001  Mar 27 16:26 
jleffler ttys002  Mar 27 16:26 
jleffler ttys003  Mar 27 16:26 
jleffler ttys004  Mar 27 16:26 
jleffler ttys005  Mar 27 16:26 
:-( PID 29195 status 0x0000
:-( PID 29196 status 0x0000

(>^_^)> ls
29191: 1 args
[ls]
EOA
pnum = 0
ploc[0] = 0
PID 29197 launched
29197: 1 args
[ls]
EOA
29197: ooo-----------------
bash.getopts.update makefile            pipeline.c          pthread-1.c         shuntzeroes.c       timezeromoves.c
cmpfltint.c         mda.c               pipeline.dSYM       pthread-2.c         so.14304827         uint128.c
const-stuff.c       mq-saurabh          pipes-13905948.c    pthread-3.c         so.367309           uname.c
dupdata.sql         mqp-saurabh         pipes-14312939.c    quine.c             so.6964747          unwrap.c
fifocircle.c        multi-pipe-sort.c   pipes-15673333      ranges.sql          so.6965001          xxx.sql
idsdb00246324.ec    multiopts.sh        pipes-15673333.c    recv.c              so.8854855.sql      yyy.sql
incunabulum.c       nextpipe.c          pipes-15673333.dSYM regress.c           strandsort.c
madump.c            pipeline            powa.c              send.c              streplace.c
:-( PID 29197 status 0x0000

(>^_^)> ls -C | wc
29191: 4 args
[ls]
[-C]
[|]
[wc]
EOA
pnum = 1
ploc[0] = 0
ploc[1] = 3
PID 29200 launched
PID 29201 launched
29200: 2 args
29201: 1 args
[ls]
[wc]
[-C]
EOA
EOA
29201: ooo-----------------
29200: ooo-----------------
:-( PID 29200 status 0x0000
      16      46     581
:-( PID 29201 status 0x0000

(>^_^)> bye
  v(^o^)^ BYE BYE!
$


标签: c shell fork pipe