Pipe function not executing properly

Posted 2019-07-29 07:45

I have constructed the following program to try to implement pipes in my own shell. A StringArray is simply a char** I have constructed. The code runs fine, but when I enter cat txt.txt | grep a, nothing prints back to the screen. When debugging, I saw that the code seems to stop around line 152 (where the print-out command is), where pid==0 and i==0.

For context, I'm calling this function in another function after a pipe has been detected.

void doPipe(StringArray sa) 
{
    printf("In 69\n"); 
    int filedes[2]; // pos. 0 output, pos. 1 input of the pipe
    int filedes2[2];

    int num_cmds = 0;

    char *command[256];

    pid_t pid;

    int err = -1;
    int end = 0;

    // Variables used for the different loops
    int i = 0;
    int j = 0;
    int k = 0;
    int l = 0;

    // First we calculate the number of commands (they are separated
    // by '|')
    while (sa[l] != NULL){
        if (strcmp(sa[l],"|") == 0){
            num_cmds++;
        }
        l++;
    }
    num_cmds++;

    // Main loop of this method. For each command between '|', the
    // pipes will be configured and standard input and/or output will
    // be replaced. Then it will be executed
    while (sa[j] != NULL && end != 1){
        k = 0;
        // We use an auxiliary array of pointers to store the command
        // that will be executed on each iteration
        while (strcmp(sa[j],"|") != 0){
            command[k] = sa[j];
            j++;    
            if (sa[j] == NULL){
                // 'end' variable used to keep the program from entering
                // again in the loop when no more arguments are found
                end = 1;
                k++;
                break;
            }
            k++;
        }
        // Last position of the command will be NULL to indicate that
        // it is its end when we pass it to the exec function
        command[k] = NULL;
        j++;        
        printf("In 121\n"); 

        // Depending on whether we are in an iteration or another, we
        // will set different descriptors for the pipes inputs and
        // output. This way, a pipe will be shared between each two
        // iterations, enabling us to connect the inputs and outputs of
        // the two different commands.
        if (i % 2 != 0){
            pipe(filedes); // for odd i
        }else{
            pipe(filedes2); // for even i
        }

        pid=fork();

        if(pid==-1){            
            if (i != num_cmds - 1){
                if (i % 2 != 0){
                    close(filedes[1]); // for odd i
                }else{
                    close(filedes2[1]); // for even i
                } 
            }           
            printf("Child process could not be created\n");
            return;
        }
        if(pid==0){
            printf("In 148\n"); 

            // If we are in the first command
            if (i == 0){
                printf("In 152\n"); 

                dup2(filedes2[1], STDOUT_FILENO);
            }
            // If we are in the last command, depending on whether it
            // is placed in an odd or even position, we will replace
            // the standard input for one pipe or another. The standard
            // output will be untouched because we want to see the 
            // output in the terminal
            else if (i == num_cmds - 1){
                printf("In 162\n"); 

                if (num_cmds % 2 != 0){ // for odd number of commands
                    dup2(filedes[0],STDIN_FILENO);
                    printf("In 166\n"); 

                }else{ // for even number of commands
                    dup2(filedes2[0],STDIN_FILENO);
                    printf("In 166\n"); 

                }
            // If we are in a command that is in the middle, we will
            // have to use two pipes, one for input and another for
            // output. The position is also important in order to choose
            // which file descriptor corresponds to each input/output
            }else{ // for odd i
                if (i % 2 != 0){
                    dup2(filedes2[0],STDIN_FILENO); 
                    dup2(filedes[1],STDOUT_FILENO);
                }else{ // for even i
                    dup2(filedes[0],STDIN_FILENO); 
                    dup2(filedes2[1],STDOUT_FILENO);                    
                } 
            }

            if (execvp(command[0],command)==err){
                kill(getpid(),SIGTERM);
            }       
        }

        // CLOSING DESCRIPTORS ON PARENT
        if (i == 0){
            close(filedes2[1]);
        }
        else if (i == num_cmds - 1){
            if (num_cmds % 2 != 0){                 
                close(filedes[0]);
            }else{                  
                close(filedes2[0]);
            }
        }else{
            if (i % 2 != 0){                    
                close(filedes2[0]);
                close(filedes[1]);
            }else{                  
                close(filedes[0]);
                close(filedes2[1]);
            }
        }

        waitpid(pid,NULL,0);

        i++;    
    }


}

1 answer
劫难
#2 · 2019-07-29 08:20

One of your big problems may be doing waitpid on each iteration of the pipeline construction. The waiting should be done at the end (remembering the pids in a list).

I had some difficulty understanding your code, so I did some simplification and cleanup. In particular, doing if (i % 2 ...) everywhere made things harder.

I've cleaned up and fixed the code. I added a struct to make things easier to manage [please pardon the gratuitous style cleanup]:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

typedef struct {
    int pipe_fildes[2];
} pipectl_t;

#define CLOSEME(_fd) \
    do { \
        close(_fd); \
        _fd = -1; \
    } while (0)

void
doPipe(char **sa)
{
    pipectl_t pipes[2];
    pipectl_t *pipein;
    pipectl_t *pipeout;
    pipectl_t *pipetmp;

    int num_cmds = 0;

    char *command[256];
    pid_t pidlist[256];

    pid_t pid;

    int err = -1;
    int end = 0;

    // Variables used for the different loops
    int icmd = 0;
    int j = 0;
    int k = 0;
    int l = 0;

    // First we calculate the number of commands (they are separated
    // by '|')
    for (int l = 0;  sa[l] != NULL;  ++l) {
        if (strcmp(sa[l], "|") == 0)
            num_cmds++;
    }
    num_cmds++;

    for (int ipipe = 0;  ipipe <= 1;  ++ipipe) {
        pipes[ipipe].pipe_fildes[0] = -1;
        pipes[ipipe].pipe_fildes[1] = -1;
    }

    pipein = &pipes[0];
    pipeout = &pipes[1];

    // Main loop of this method. For each command between '|', the
    // pipes will be configured and standard input and/or output will
    // be replaced. Then it will be executed
    while (sa[j] != NULL && end != 1) {
        // We use an auxiliary array of pointers to store the command
        // that will be executed on each iteration
        k = 0;
        while (strcmp(sa[j], "|") != 0) {
            command[k] = sa[j];
            j++;
            k++;
            if (sa[j] == NULL) {
                // 'end' variable used to keep the program from entering
                // again in the loop when no more arguments are found
                end = 1;
                break;
            }
        }

        // Last position of the command will be NULL to indicate that
        // it is its end when we pass it to the exec function
        command[k] = NULL;

        j++;

        // swap input and output, so previous child's output becomes the new
        // child's input
        // NOTE: by doing this here, in one place, we eliminate all the i % 2
        // if statements
        pipetmp = pipein;
        pipein = pipeout;
        pipeout = pipetmp;

        // are we the last command?
        int lastflg = (icmd == (num_cmds - 1));

        // last command does _not_ have an output pipe, so don't create one
        if (! lastflg)
            pipe(pipeout->pipe_fildes);

        pid = fork();

        // NOTE: fork failure almost never happens and is fatal
        if (pid == -1) {
            printf("Child process could not be created\n");
            return;
        }

        // process child
        if (pid == 0) {
            // NOTE: after we've dup'ed a file descriptor, we close it

            // first command does _not_ have a pipe for input
            if (icmd > 0)
                dup2(pipein->pipe_fildes[0],STDIN_FILENO);
            CLOSEME(pipein->pipe_fildes[0]);

            // last command does _not_ have a pipe for output
            if (! lastflg)
                dup2(pipeout->pipe_fildes[1],STDOUT_FILENO);
            CLOSEME(pipeout->pipe_fildes[1]);

            // close the parent sides of the pipes (in this child)

            // close previous child's output descriptor (the feed for our input)
            CLOSEME(pipein->pipe_fildes[1]);

            // close next child's input descriptor (our feed for its input)
            CLOSEME(pipeout->pipe_fildes[0]);

            if (execvp(command[0], command) == err) {
#if 0
                kill(getpid(), SIGTERM);
#else
                exit(1);
#endif
            }
        }

        // close all input descriptors for _this_ child
        CLOSEME(pipein->pipe_fildes[0]);
        CLOSEME(pipein->pipe_fildes[1]);

        // close output side of _this_ child's output pipe [which becomes next
        // child's input pipe]
        CLOSEME(pipeout->pipe_fildes[1]);

        pidlist[icmd] = pid;

        icmd++;
    }

    // wait for all pids _after_ the entire pipeline is constructed
    for (int icmd = 0;  icmd < num_cmds;  ++icmd)
        waitpid(pidlist[icmd], NULL, 0);
}

// main -- main program
int
main(int argc,char **argv)
{
    char *cp;
    char *bp;
    char buf[1000];
    char **av;
    char *avlist[256];

    --argc;
    ++argv;

    for (;  argc > 0;  --argc, ++argv) {
        cp = *argv;
        if (*cp != '-')
            break;

        switch (cp[1]) {
        default:
            break;
        }
    }

    while (1) {
        printf("> ");
        fflush(stdout);

        cp = fgets(buf,sizeof(buf),stdin);
        if (cp == NULL)
            break;

        av = avlist;
        bp = buf;
        while (1) {
            cp = strtok(bp," \t\r\n");
            bp = NULL;

            if (cp == NULL)
                break;

            *av++ = cp;
        }
        *av = NULL;

        doPipe(avlist);
    }

    return 0;
}

UPDATE:

When I run this code, the same command cat txt.txt | grep a only appears to do the first command, and not the second after the pipe. (It cats out the txt file but does not grep)

I tested the entire program before I posted. I just retested using a cat/grep command. It worked, but that was my program unchanged.

Any ideas why this could be happening? I implemented your doPipe method in my code and passed in my StringArray sa which is just a char ** as well.

My suggestions are:

  1. Verify that my unchanged version works for you.
  2. Using gdb, set a breakpoint on doPipe and look at the arguments. For both programs, they should be the same.
  3. If StringArray is truly char **, replace it in your version to ensure it makes no difference. That is, use void doPipe(char **sa) and see if your code still compiles. In gdb at the breakpoint, you should be able to do ptype sa on both programs.
  4. The StringArray looks a bit "Java-esque" to me :-) I'd avoid it, particularly here since execvp wants a char **
  5. Verify that sa is properly NULL terminated. If it isn't, the last command in the pipeline may be bogus/garbage, and the error checking for a failed execvp isn't that robust.
  6. Verify that num_cmds is the same.
  7. Try cat txt.txt | grep a | sed -e s/a/b/. If you get the cat and grep, but not the sed, this means num_cmds is not correct
  8. Verify that caller's parsing of the buffer puts the "|" in a separate token. That is, this code works with cat txt.txt | grep a but it will not work with: cat txt.txt|grep a

UPDATE #2:

BTW, if your pipe code still isn't working (e.g. the last command is not executed), check to see if the last token has newline on it (i.e. the newline wasn't stripped correctly).
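
A minimal sketch of that check, assuming the line was read with fgets and is tokenized by a parser that splits only on spaces (the main above avoids the problem by listing \r and \n as strtok delimiters). The helper name strip_newline is made up for illustration:

```c
#include <string.h>

// Hypothetical helper: remove a trailing "\n" or "\r\n" left by fgets().
// strcspn() returns the index of the first '\r' or '\n' (or the string
// length if neither is present), so the assignment is always in bounds.
static void strip_newline(char *buf)
{
    buf[strcspn(buf, "\r\n")] = '\0';
}
```

Calling this on the buffer right after fgets means the last token is grep's a rather than a followed by a newline.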

I've tried all of this but still can't get my redirection code to work with this. Essentially, I'm confused as to where in this code I should be checking for '<' or '>'

Doing the general parsing to support redirection (e.g. < or >), pipes (e.g. |), multiple commands per line (e.g. ;), embedded sub-shells (e.g. (echo the date is ; date)), and detached jobs (e.g. &) can require a bit of care and you need a multilevel approach.

I suspect that after you get pipes and/or redirection working, you're tasked with implementing even more shell syntax. I've done this before, so, rather than you figuring it out piecemeal, here is what you'll need to do ...

You'll need to scan the input buffer char-by-char and save off tokens into a "token" struct that also has a type. You'll need a linked list of those structs. More on this below.

When you encounter a quoted string, you'll need to strip off the quotes: "abc" --> abc, being mindful of escaped quotes: "ab\"c" --> ab"c.

Also, you have to be careful about quoted strings abutting [what perl calls] "bareword" strings: echo abc. If we have abc"d ef"ghi, this needs to be concatenated into a single string token: abcd efghi
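
The quote stripping and the concatenation of abutting pieces can be sketched in one pass. This is only an illustration: the name read_word is made up, only double quotes are handled, and a backslash is treated the same way inside and outside quotes, which is simpler than what a real shell does.

```c
#include <stddef.h>

// Hypothetical helper: copy one shell "word" from src into dst, removing
// double quotes and backslash escapes. Quoted and unquoted pieces that
// touch each other end up in the same word. Returns the number of source
// characters consumed. Output is truncated if dst is too small.
static size_t read_word(const char *src, char *dst, size_t dstsize)
{
    size_t in = 0;
    size_t out = 0;
    int quoted = 0;             // nonzero while inside "..."

    if (dstsize == 0)
        return 0;

    while (src[in] != '\0') {
        char c = src[in];

        if (c == '\\' && src[in + 1] != '\0') {
            // backslash: take the next character literally
            c = src[in + 1];
            in += 2;
        } else if (c == '"') {
            // quote character: toggle state, do not copy it
            quoted = !quoted;
            in++;
            continue;
        } else if (!quoted && (c == ' ' || c == '\t' || c == '\n')) {
            // unquoted whitespace ends the word
            break;
        } else {
            in++;
        }

        if (out + 1 < dstsize)
            dst[out++] = c;
    }

    dst[out] = '\0';
    return in;
}
```

Given abc"d ef"ghi it produces the single word abcd efghi, and given "ab\"c" it produces ab"c. The caller would skip whitespace, call this, advance by the returned count, and repeat.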

Backslashes on redirectors must also be accounted for. echo abc > def is a redirection that will put abc into the file def. But, echo abc \> def should just output abc > def literally to stdout. Backslash on the other "punctuation" is similar.

You'll also have to handle the fact that punctuation doesn't have to have whitespace around it. That is, echo abc>def has to be handled just as if it were echo abc > def.
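
Here is a rough sketch of that splitting step. The function name split_punct and the TOKLEN limit are invented for the example. Quoting and backslashes are deliberately ignored, and every punctuation character becomes its own token, so multi-character forms such as 2>&1 would need extra handling.

```c
#include <stddef.h>
#include <string.h>

#define TOKLEN 256

// Hypothetical helper: split line into tokens, treating each of | < > ; &
// as a one-character token even with no surrounding whitespace.
// Returns the number of tokens stored in toks (at most maxtok).
static int split_punct(const char *line, char toks[][TOKLEN], int maxtok)
{
    int ntok = 0;
    size_t i = 0;

    while (line[i] != '\0' && ntok < maxtok) {
        // skip whitespace between tokens
        if (strchr(" \t\r\n", line[i]) != NULL) {
            i++;
            continue;
        }

        size_t len = 0;
        if (strchr("|<>;&", line[i]) != NULL) {
            // punctuation is a token by itself
            toks[ntok][len++] = line[i++];
        } else {
            // ordinary word: runs until whitespace or punctuation
            while (line[i] != '\0' &&
                   strchr(" \t\r\n|<>;&", line[i]) == NULL) {
                if (len + 1 < TOKLEN)
                    toks[ntok][len++] = line[i];
                i++;
            }
        }
        toks[ntok][len] = '\0';
        ntok++;
    }
    return ntok;
}
```

With this, echo abc>def yields the four tokens echo, abc, >, def, and cat txt.txt|grep a yields the same five tokens as the spaced-out form.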

Also, punctuation inside a quoted string should be treated as if it were escaped above. That is, echo abc ">" def is not a redirection and [again] should be treated as a simple command.

Also, if the current line ends in a backslash (e.g. \<newline>), this means that the next line is a "continuation" line. You should strip the backslash and the newline. Then, read another line and continue to build up the token list.
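
A small sketch of the continuation check, assuming each line comes from fgets and still has its newline attached. The name strip_continuation is hypothetical, and the sketch does not try to tell a real continuation from an escaped backslash that happens to sit at the end of a line.

```c
#include <string.h>

// Hypothetical helper: if buf ends in backslash-newline, remove both
// characters and return 1 so the caller knows to read another line and
// append it. Otherwise leave buf unchanged and return 0.
static int strip_continuation(char *buf)
{
    size_t len = strlen(buf);

    if (len >= 2 && buf[len - 2] == '\\' && buf[len - 1] == '\n') {
        buf[len - 2] = '\0';
        return 1;
    }
    return 0;
}
```

The caller would loop: while this returns 1, fgets the next line into buf + strlen(buf) (with the remaining size) and check again, then tokenize the combined buffer.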

Further, while & can be for detached jobs, as in: date &, it can also be part of a redirection, as in gcc -o myshell myshell.c 2>&1 >logfile

Okay, so to manage all this we need types for tokens and a token struct:

// token types
typedef enum {
    TOKEN_NORMAL,                       // simple token/string
    TOKEN_QUO1,                         // quoted string
    TOKEN_QUO2,                         // quoted string
    TOKEN_SEMI,                         // command separator (e.g. ;)
    TOKEN_OREDIR,                       // output redirector (e.g. >)
    TOKEN_IREDIR,                       // input redirector (e.g. <)
    TOKEN_PIPE,                         // pipe separator (e.g. |)
    TOKEN_AMP                           // an & (can be detach or redirect)
} toktype_t;

// token control
typedef struct token token_t;
struct token {
    token_t *tok_next;                  // forward link
    token_t *tok_prev;                  // backward link
    toktype_t tok_type;                 // token type
    char tok_str[256];                  // token value
};

// token list
typedef struct tlist tlist_t;
struct tlist {
    tlist_t *tlist_next;                // forward link
    tlist_t *tlist_prev;                // backward link

    token_t *tlist_head;                // pointer to list head
    token_t *tlist_tail;                // pointer to list tail
};
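
To show how these structs might be used, here is a sketch of appending a token to a list. The helper tlist_append is not part of the original code, and the type definitions are repeated (with the list struct tagged tlist) only so the sketch compiles on its own.

```c
#include <stdlib.h>
#include <string.h>

typedef enum { TOKEN_NORMAL, TOKEN_QUO1, TOKEN_QUO2, TOKEN_SEMI,
               TOKEN_OREDIR, TOKEN_IREDIR, TOKEN_PIPE, TOKEN_AMP } toktype_t;

typedef struct token token_t;
struct token {
    token_t *tok_next;                  // forward link
    token_t *tok_prev;                  // backward link
    toktype_t tok_type;                 // token type
    char tok_str[256];                  // token value
};

typedef struct tlist tlist_t;
struct tlist {
    tlist_t *tlist_next;                // forward link
    tlist_t *tlist_prev;                // backward link
    token_t *tlist_head;                // pointer to list head
    token_t *tlist_tail;                // pointer to list tail
};

// Hypothetical helper: allocate a token and link it at the tail of list.
// Returns the new token, or NULL if allocation fails.
static token_t *tlist_append(tlist_t *list, toktype_t type, const char *str)
{
    token_t *tok = calloc(1, sizeof(*tok));
    if (tok == NULL)
        return NULL;

    tok->tok_type = type;
    // copy at most 255 characters; calloc already zeroed the terminator
    strncpy(tok->tok_str, str, sizeof(tok->tok_str) - 1);

    tok->tok_prev = list->tlist_tail;
    if (list->tlist_tail != NULL)
        list->tlist_tail->tok_next = tok;
    else
        list->tlist_head = tok;
    list->tlist_tail = tok;

    return tok;
}
```

Starting from a zero-initialized tlist_t, the scanner would call this once per token it recognizes.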

Initially, after parsing an input line [being mindful of the continuations], we have a single tlist.

If the list has ; separators in it, we split on them to create sublists. We then loop on the sublists and execute the commands in order.

When looking at a subcommand, if it ends in &, the command must be run detached. We note that and pop it off the back of the list.
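
As a simplified illustration, here is the same idea on a plain NULL-terminated char ** array (like the sa passed to doPipe) instead of a token list. The name pop_background is made up for the example.

```c
#include <string.h>

// Hypothetical helper: if the last token is "&", remove it and return 1
// to tell the caller the command should run detached; otherwise return 0.
static int pop_background(char **toks)
{
    int n = 0;

    while (toks[n] != NULL)
        n++;

    if (n > 0 && strcmp(toks[n - 1], "&") == 0) {
        toks[n - 1] = NULL;
        return 1;
    }
    return 0;
}
```

When this returns 1, the shell would skip the waitpid loop for that command (or reap the children later) instead of blocking on it.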

Okay, now we have a list that might be of the form:

cat < /etc/passwd | grep root | sed -e s/root/admin/ > /tmp/out

Now, we do a further split on the | so we have a list that has three elements:

cat < /etc/passwd
grep root
sed -e s/root/admin/ > /tmp/out

Actually, each of those "lines" is a tlist and this is a two dimensional list of lists:

list_of_tlists:
  |
  |
tlist[0] --> cat --> < --> /etc/passwd
  |
  |
tlist[1] --> grep --> root
  |
  |
tlist[2] --> sed --> -e --> s/root/admin/ --> > --> /tmp/out
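
The split on | can be sketched on a flat token array as well. This is not the linked-list version described above, just a compact stand-in, and split_on_pipe is a hypothetical name.

```c
#include <string.h>

// Hypothetical helper: split a NULL-terminated token array on "|" in place.
// Each "|" entry is overwritten with NULL so every segment is itself a
// NULL-terminated argv; starts[i] receives the first token of segment i.
// Returns the number of segments stored (at most maxseg).
static int split_on_pipe(char **toks, char ***starts, int maxseg)
{
    int nseg = 0;
    int at_start = 1;           // next ordinary token begins a segment

    for (int i = 0; toks[i] != NULL; ++i) {
        if (strcmp(toks[i], "|") == 0) {
            toks[i] = NULL;
            at_start = 1;
        } else if (at_start) {
            if (nseg < maxseg)
                starts[nseg++] = &toks[i];
            at_start = 0;
        }
    }
    return nseg;
}
```

For the example line this gives three segments starting at cat, grep, and sed, and each starts[i] can be handed to execvp once its redirections have been dealt with.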

As we create the pipeline, we note the redirections and do file open instead of pipe as needed.
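
A hedged sketch of that step, again on a plain char ** command rather than a token list. Both helper names are invented for the example, only < and > are recognized (no >> or 2>), and the 0644 file mode is an assumption.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

// Hypothetical helper: remove "< file" and "> file" pairs from a
// NULL-terminated argv in place, reporting the file names through
// *infile and *outfile (left NULL when no such redirection is present).
// Returns 0 on success, -1 if a redirector has no file name after it.
static int extract_redirs(char **argv, char **infile, char **outfile)
{
    int dst = 0;

    *infile = NULL;
    *outfile = NULL;

    for (int src = 0; argv[src] != NULL; ++src) {
        int is_in = (strcmp(argv[src], "<") == 0);
        int is_out = (strcmp(argv[src], ">") == 0);

        if (is_in || is_out) {
            if (argv[src + 1] == NULL)
                return -1;
            if (is_in)
                *infile = argv[src + 1];
            else
                *outfile = argv[src + 1];
            ++src;              // skip the file name as well
        } else {
            argv[dst++] = argv[src];
        }
    }
    argv[dst] = NULL;
    return 0;
}

// Hypothetical helper, meant to run in the child before execvp():
// open path and make it the target descriptor (STDIN_FILENO or
// STDOUT_FILENO). Returns 0 on success, -1 on failure.
static int redirect_fd(const char *path, int flags, int target)
{
    int fd = open(path, flags, 0644);
    if (fd == -1)
        return -1;
    if (dup2(fd, target) == -1) {
        close(fd);
        return -1;
    }
    if (fd != target)
        close(fd);
    return 0;
}
```

In the child branch of doPipe, one place to hook this in would be just before execvp: call extract_redirs on command, then redirect_fd(infile, O_RDONLY, STDIN_FILENO) if infile is set and redirect_fd(outfile, O_WRONLY | O_CREAT | O_TRUNC, STDOUT_FILENO) if outfile is set. Done after the pipe dup2 calls, a file redirection overrides the pipe for that descriptor.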

Okay, that's the abstract.

See my answer here: Implementing input/output redirection in a Linux shell using C for a full and complete implementation.

On that page, there is code for just doing redirections. It could probably be adapted to include pipes by merging that code with the code I've posted here.

That OP asked for help in doing redirections and pipes.

Side note: At that time, there was a spate of shell implementation questions. So, I ended up producing a full shell that does pretty much everything. But, that version was too large to post on SO. So, in that page, find the pastebin link I posted. It has full source code. It can be downloaded, built, and run.

You may not want to use that code directly, but it should give you some ideas. Also, the full version may do things a little bit differently than what I've described above.
