I have been trying to understand fork() behavior, this time in a for-loop. Observe the following code:
#include <stdio.h>
#include <unistd.h>  // fork(), getpid(), getppid()

int main(void)
{
    int i;
    for (i = 0; i < 3; i++)
    {
        fork();
        // This printf statement is for debugging purposes
        // getppid(): gets the parent process-id
        // getpid(): gets the calling process's own process-id
        printf("[%d] [%d] i=%d\n", getppid(), getpid(), i);
    }
    printf("[%d] [%d] hi\n", getppid(), getpid());
    return 0;
}
Here is the output:
[6909][6936] i=0
[6909][6936] i=1
[6936][6938] i=1
[6909][6936] i=2
[6909][6936] hi
[6936][6938] i=2
[6936][6938] hi
[6938][6940] i=2
[6938][6940] hi
[1][6937] i=0
[1][6939] i=2
[1][6939] hi
[1][6937] i=1
[6937][6941] i=1
[1][6937] i=2
[1][6937] hi
[6937][6941] i=2
[6937][6941] hi
[6937][6942] i=2
[6937][6942] hi
[1][6943] i=2
[1][6943] hi
I am a very visual person, and so the only way for me to truly understand things is by diagramming. My instructor said there would be 8 hi statements. I wrote and ran the code, and indeed there were 8 hi statements. But I really didn’t understand it. So I drew the following diagram:
Diagram updated to reflect comments :)
Observations:
- Parent process (main) must iterate the loop 3 times. Then printf is called
- On each iteration of parent for-loop a fork() is called
- After each fork() call, i is incremented, and so every child starts a for-loop from i before it is incremented
- At the end of each for-loop, "hi" is printed
Here are my questions:
- Is my diagram correct?
- Why are there two instances of i=0 in the output?
- What value of i is carried over to each child after the fork()? If the same value of i is carried over, then when does the "forking" stop?
- Is it always the case that 2^n - 1 would be a way to count the number of children that are forked? So, here n=3, which means 2^3 - 1 = 8 - 1 = 7 children, which is correct?
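- i++ is executed after the call of fork, because that's the way the for loop works.
- Also keep in mind that fork may fail.
A little explanation on the point about i++: the for loop in the question is similar to a while loop that calls fork() in its body and then executes i++ as the last statement of each pass.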
So i in the forked processes (both parent and child) is the value before the increment. However, the increment is executed immediately after fork(), so in my opinion, the diagram could be treated as correct.
Here's how to understand it, starting at the for loop.
- Loop starts in parent, i == 0.
- Parent fork()s, creating child 1.
- You now have two processes. Both print i=0.
- Loop restarts in both processes, now i == 1.
- Parent and child 1 fork(), creating children 2 and 3.
- You now have four processes. All four print i=1.
- Loop restarts in all four processes, now i == 2.
- Parent and children 1 through 3 all fork(), creating children 4 through 7.
- You now have eight processes. All eight print i=2.
- Loop restarts in all eight processes, now i == 3.
- Loop terminates in all eight processes, as i < 3 is no longer true.
- All eight processes print hi.
- All eight processes terminate.
So you get 0 printed two times, 1 printed four times, 2 printed 8 times, and hi printed 8 times.
To answer your questions one by one:
Yes, essentially. It's a very nice diagram, too.
That is to say, it's correct if you interpret the i=0 etc. labels as referring to full loop iterations. What the diagram doesn't show, however, is that, after each fork(), the part of the current loop iteration after the fork() call is also executed by the forked child process.
Because you have the printf() after the fork(), it's executed by both the parent process and the just forked child process. If you move the printf() before the fork(), it will only be executed by the parent (since the child process doesn't exist yet).
The value of i is not changed by fork(), so the child process sees the same value as its parent.
The thing to remember about fork() is that it's called once, but it returns twice: once in the parent process, and once in the newly cloned child process.
For a simpler example, consider the following code:
The child process created by fork() is an (almost) exact clone of its parent, and so, from its own viewpoint, it "remembers" being its parent, inheriting all of the parent process's state (including all variable values, the call stack and the instruction being executed). The only immediate difference (other than system metadata such as the process ID returned by getpid()) is the return value of fork(), which will be zero in the child process but non-zero (actually, the ID of the child process) in the parent.
Every process that executes a fork() turns into two processes (except under unusual error conditions, where fork() might fail). If the parent and child keep executing the same code (i.e. they don't check the return value of fork(), or their own process ID, and branch to different code paths based on it), then each subsequent fork will double the number of processes. So, yes, after three forks, you will end up with 2³ = 8 processes in total.
, or their own process ID, and branch to different code paths based on it), then each subsequent fork will double the number of processes. So, yes, after three forks, you will end up with 2³ = 8 processes in total.