A C program (compiled and running on Linux CentOS 6.3) has the line:
execve(cmd, argv, envp);
execve does not return, but I want to modify the code to know when it is finished. So I do this:
if (child = fork()) {
    waitpid(child, NULL, 0);
    /*now I know execve is finished*/
    exit(0);
}
execve(cmd, argv, envp);
When I do this, the resulting program works 99% of the time, but very rarely it exhibits strange errors.
Is anything wrong with the above? I expect the above code to run exactly as before (except a little slower). Am I correct?
If you want to know the background, the modified code is dash. The execve call is used to run a simple command, after dash has figured out the string to run. When I modify it precisely as above (without even running anything after waiting), recompile, and run programs under the modified dash, most of the time they run fine. However, a recompilation of one particular kernel module called "biosutility" gives me this error:
cc1: error: unrecognized command line option "-mfentry"
Following Rici's excellent comments and answer, I found the root cause of the problem.
The original code exits with whatever status cmd exited with. I changed that to exit with 0 always. That is why the code behaves differently. The following fix does not exhibit the error:
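As a hedged sketch of such a fix (not necessarily the exact code used), the parent can collect the child's status with waitpid() and exit with the same code instead of 0. The helper names exit_code_from_status and run_and_wait are hypothetical, and reporting 128 + signal number for a child killed by a signal is an assumption borrowed from common shell convention:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: turn a waitpid() status into an exit code.
 * The 128 + signal value mirrors common shell convention (an assumption). */
static int exit_code_from_status(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return 1;   /* neither a normal exit nor a signal: generic failure */
}

/* Hypothetical wrapper around the original execve() call: run cmd in a
 * child, wait for it, and return the code the caller should exit with. */
static int run_and_wait(const char *cmd, char *const argv[], char *const envp[])
{
    int status;
    pid_t child = fork();

    if (child == -1) {
        perror("fork");
        return 1;
    }
    if (child == 0) {
        execve(cmd, argv, envp);
        perror("execve");   /* reached only if execve() failed */
        _exit(127);
    }
    if (waitpid(child, &status, 0) == -1) {
        perror("waitpid");
        return 1;
    }
    /* the command started by execve() has finished at this point */
    return exit_code_from_status(status);
}
```

In place of the original line, the call would then be exit(run_and_wait(cmd, argv, envp)); so that the caller sees the command's own exit code.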
For this question: "is anything wrong with the above??" and regarding the code in the question:
The fork() function has three kinds of return value:
-1 means an error occurred
=0 means fork() was successful and the child process is running
>0 means fork() was successful and the parent process is running
execvp() needs to be followed (for the rare case of the call failing) with error handling.
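A minimal sketch of handling all three fork() results and a failed execvp(), assuming a hypothetical helper named spawn_and_wait (the name, and the choice to return the raw wait status, are not from the original answer):

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] via execvp() in a child process and
 * return the raw waitpid() status, or -1 if fork() or waitpid() failed. */
static int spawn_and_wait(char *const argv[])
{
    int status = 0;
    pid_t child = fork();

    switch (child) {
    case -1:                      /* fork() failed: no child exists */
        perror("fork failed");
        return -1;

    case 0:                       /* running in the child */
        execvp(argv[0], argv);
        perror("execvp failed");  /* only reached if execvp() returned */
        _exit(EXIT_FAILURE);

    default:                      /* running in the parent */
        if (waitpid(child, &status, 0) == -1) {
            perror("waitpid failed");
            return -1;
        }
        return status;
    }
}
```

The child uses _exit() rather than exit() so that stdio buffers inherited from the parent are not flushed a second time; that is a design choice of this sketch.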
fork() returns a pid_t. After the call, the code needs to handle each of the three return values (using child as the pid variable).
For your second question, about the error message:
The word "unrecognized" is misspelled, so this is not the actual error message.
This error message is not related to your question about the changes you made to dash.
However, dash does not directly invoke any compile operations, so I suspect the questions are totally unrelated.
Suggest looking at the makefile for the biosutility utility to see why an invalid parameter is being passed to cc1.
Here's one possibility.
dash does, in fact, need to know when a child process terminates. It must reap the child (by waiting for it) to avoid filling the process table with zombies, and anyway it cares about the exit status of the process.
Now, it knows what the PID of the process it started was, and it can use that when it does a wait to figure out which process terminated and therefore what to do with the exit status.
But you are doing an extra fork. So dash thinks it started some process with PID, say, 368. But you fork a new child, say PID 723. Then you wait for that child, but you ignore the status code. Finally, your process terminates successfully. So then dash notices that process 368 terminated successfully. Even if it didn't.
Now suppose dash was actually executing a script that runs do_something followed by do_something_else.
The programmer has specified that the shell definitely shouldn't do_something_else if do_something failed. Terrible things could happen. Or at least mysterious things. Yet, you have hidden that failure. So dash cheerfully fires up do_something_else. Et voilà.
Well, it's just a theory. I have no idea, really, but it shows the sort of thing that can happen.
The bottom line is that dash has some mechanism which lets it know when child processes have finished, and if you want to hook into the exit handling of a child process, you'd be much better off figuring out how that mechanism works so that you can hook into it. Trying to add your own additional mechanism is almost certain to end in tears.
Read carefully the documentation of execve(2), fork(2) and waitpid(2). Both execve and fork are tricky, and fork is difficult to understand. I strongly suggest reading Advanced Linux Programming (freely available online, but you could buy the paper book), which has several chapters on these questions. (Don't be afraid of spending a few days reading and understanding these system calls; they are tricky.)
Some important points:
Every system call can fail, and you should always handle its failure, at least by showing some error message with perror(3) and immediately exit(3)-ing.
The execve(2) syscall returns only on failure (when successful, it does not return, since the calling program has been replaced, so wiped out!), hence most calls to it (and to the similar exec(3) functions) are often like:
It is customary to use a weird exit code like 127 (usually unused, except like above) on execve failure, and very often you could not do anything else. When used (almost always) with fork, you'll often call execve in the child process.
The fork(2) syscall returns twice on success (once in the parent process, once in the child process). This is tricky to understand; read the references I gave. It returns only once on failure. So you always keep the result of fork and test it.
Suggestion: use strace(1) on some programs, perhaps try strace -f bash -c 'date; pwd' and study the output. It mentions many syscalls(2).
Your sample code might (sometimes) work by just adding some else after the if block, but that code is still incorrect because failures are not handled.