How to correctly wait for execve to finish?

Posted 2019-02-26 07:44

A C program (compiled and running on Linux, CentOS 6.3) has the line:

execve(cmd, argv, envp);

execve does not return on success, but I want to modify the code so that I know when the executed command has finished. So I do this:

if (child = fork()) {
    waitpid(child, NULL, 0);
    /*now I know execve is finished*/
    exit(0);
}

execve(cmd, argv, envp);

When I do this, the resulting program works 99% of the time, but very rarely it exhibits strange errors.

Is anything wrong with the above? I expect the modified code to behave exactly as before (only a little slower). Am I correct?

If you want to know the background, the modified code is dash. The execve call is used to run a simple command, after dash has figured out the string to run. When I modify it exactly as above (without even running anything after the wait), recompile, and run programs under the modified dash, most of them run fine. However, recompiling one particular kernel module called "biosutility" gives me this error:

cc1: error: unrecognized command line option "-mfentry"

Tags: c linux shell
4 answers
Luminary・发光体
#2 · 2019-02-26 08:25

Following Rici's excellent comments and answer, I found the root cause of the problem.

The original code exits with whatever status cmd exited with. I changed that to always exit with 0. That is why the code behaves differently.

The following fix does not exhibit the error:

int status;

if (child = fork()) {
    waitpid(child, &status, 0);
    /* now we know execve is finished */
    if (WIFEXITED(status))
        exit(WEXITSTATUS(status));
    exit(1);
}

execve(cmd, argv, envp);
做个烂人
#3 · 2019-02-26 08:25

Regarding the question "is anything wrong with the above??" and this code:

if (child = fork()) {
    waitpid(child, NULL, 0);
    /*now I know execve is finished*/
    exit(0);
}

execve(cmd, argv, envp);

  1. The fork() function has three kinds of return value:

    • -1 means an error occurred
    • 0 means fork() succeeded and this is the child process
    • >0 means fork() succeeded and this is the parent process; the value is the child's PID
  2. The call to execve() needs to be followed (for the rare case of the call failing) with

 perror( "execve failed" );
 exit( EXIT_FAILURE );
  3. The call to fork() returns a pid_t. After the call, the code needs to be similar to the following (using child as the pid variable):

if( 0 > child )
{
    perror( "fork failed");
    exit( EXIT_FAILURE );
}

else if( 0 == child )
{ // then child process
    execve(cmd, argv, envp);
    perror( "execve failed" );
    exit( EXIT_FAILURE );
}

// else parent process (note: the child's exit status is discarded here)
waitpid(child, NULL, 0);
exit( EXIT_SUCCESS );

For your second question, about the error message:

cc1: error: unrecognized command line option "-mfentry"

The wording "unrecognized command line option" is what GCC prints when cc1 is given a flag it does not support, so the message means that -mfentry was passed to a compiler that does not accept it.

dash does not invoke any compile operations directly, so I suspect this error message is unrelated to the changes you made to dash.

I suggest looking at the makefile for biosutility to find out why an invalid option is being passed to cc1.

Lonely孤独者°
#4 · 2019-02-26 08:26

Here's one possibility.

dash does, in fact, need to know when a child process terminates. It must reap the child (by waiting on it) to avoid filling the process table with zombies, and anyway it cares about the exit status of the process.

Now, it knows what the PID of the process it started was, and it can use that when it does a wait to figure out which process terminated and therefore what to do with the exit status.

But you are doing an extra fork. So dash thinks it started some process with PID, say, 368. But you fork a new child, say PID 723. Then you wait for that child, but you ignore the status code. Finally, your process terminates successfully. So then dash notices that process 368 terminated successfully. Even if it didn't.

Now suppose dash was actually executing a script like

do_something && do_something_else

The programmer has specified that the shell definitely shouldn't do_something_else if do_something failed. Terrible things could happen. Or at least mysterious things. Yet, you have hidden that failure. So dash cheerfully fires up do_something_else. Et voilà.

Well, it's just a theory. I have no idea, really, but it shows the sort of thing that can happen.
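
To make the scenario concrete, here is a small sketch that mimics it outside of dash. run_plain and run_masked are hypothetical helpers, not dash code: the first runs a command the way the unmodified shell would see it, and the second adds the extra fork and the unconditional exit status 0 from the question. It assumes /bin/sh exists.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for pid; return its exit code, or -1 if it did not exit normally. */
static int reap(pid_t pid)
{
    int status;

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Replace the current process with `/bin/sh -c script`. */
static void exec_script(const char *script)
{
    execl("/bin/sh", "sh", "-c", script, (char *)NULL);
    _exit(127);                   /* only reached if execl failed */
}

/* What the caller sees without the extra fork. */
int run_plain(const char *script)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        exec_script(script);
    return reap(pid);
}

/* What the caller sees with the question's extra fork and exit(0). */
int run_masked(const char *script)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        pid_t inner = fork();

        if (inner < 0)
            _exit(1);
        if (inner == 0)
            exec_script(script);
        waitpid(inner, NULL, 0);  /* status thrown away ... */
        _exit(0);                 /* ... and success reported regardless */
    }
    return reap(pid);
}
```

With these, run_plain("exit 1") reports 1 while run_masked("exit 1") reports 0, so a caller implementing do_something && do_something_else would go on to run the second command in the masked case.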

The bottom line is that dash has some mechanism which lets it know when child processes have finished, and if you want to hook into the exit handling of a child process, you'd be much better off figuring out how that mechanism works so that you can hook into it. Trying to add your own additional mechanism is almost certain to end in tears.

你好瞎i
#5 · 2019-02-26 08:34

Carefully read the documentation of execve(2), fork(2), and waitpid(2). Both execve and fork are tricky, and fork is difficult to understand. I strongly suggest reading Advanced Linux Programming (freely available online, though you can also buy the paper book), which has several chapters on these questions.

(Don't be afraid of spending a few days reading and understanding these system calls; they are tricky.)

Some important points.

  • every system call can fail, and you should always handle the failure, at least by printing an error message with perror(3) and then calling exit(3) immediately.

  • the execve(2) syscall normally does not return: it returns only on failure (on success the calling program has been replaced, so there is nothing left to return to). Hence most calls to it (and to the similar exec(3) functions) look like:

    if (execve(cmd, argv, envp)) { perror (cmd); exit(127); };
    /* else branch cannot be reached! */
    

    It is customary to use an unusual exit code such as 127 (rarely used for anything else) on execve failure, and very often there is nothing else you can do. execve is almost always used together with fork, and you will usually call it in the child process.

  • the fork(2) syscall returns twice on success (once in the parent process, once in the child process). This is tricky to understand; read the references I gave. On failure it returns only once. So you should always keep the result of fork; typical code would be:

    pid_t pid = fork();
    if (pid < 0) { // fork has failed
        perror("fork"); exit(EXIT_FAILURE);
    }
    else if (pid == 0) { // successful fork, in the child process
        // very often you call execve in the child, so you don't continue here.
        // example code:
        if (execve(cmd, argv, envp)) { perror(cmd); exit(127); }
        // not reached!
    }
    // here pid is positive: we are in the parent and fork succeeded
    // do something sensible; at some point you need to call waitpid and use pid
    

Suggestion: use strace(1) on some programs; for example, try strace -f bash -c 'date; pwd' and study the output. It shows many syscalls(2).

Your sample code might (sometimes) work if you just add an else, like this:

// better code, but still wrong because of unhandled failures....
if ((child = fork())>0) { 
  waitpid(child, NULL, 0);
  /*now I know execve is finished*/
  exit(0);
}
/// missing handling of `fork`  failure!
else if (!child) {
   execve(cmd, argv, envp);
   /// missing handling of `execve` failure
}

but that code is still incorrect because failures are not handled.
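
For comparison, a sketch with each failure handled might look like the following. run_and_wait is a hypothetical wrapper, not part of dash; returning -1 when fork or waitpid fails and 128 + signal for a child killed by a signal are assumptions made for this example.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run cmd in a child process and return its exit code.
 * Returns -1 if fork() or waitpid() fails. */
int run_and_wait(const char *cmd, char *const argv[], char *const envp[])
{
    pid_t child = fork();

    if (child < 0) {
        perror("fork");
        return -1;
    }
    if (child == 0) {
        execve(cmd, argv, envp);
        /* execve only returns on failure */
        perror(cmd);
        _exit(127);
    }

    /* Parent: wait for the child, retrying if a signal interrupts us. */
    int status;
    pid_t r;

    do {
        r = waitpid(child, &status, 0);
    } while (r < 0 && errno == EINTR);

    if (r < 0) {
        perror("waitpid");
        return -1;
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);     /* pass the child's code through */
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);  /* assumption: shell-style 128 + N */
    return -1;
}
```

A caller that wants to stay close to the behavior of the original bare execve would then exit with the returned value, after mapping -1 to whatever failure code it prefers.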
