Visually what happens to fork() in a For Loop

Posted 2019-01-16 07:17

Question:

I have been trying to understand fork() behavior. This time in a for-loop. Observe the following code:

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
   int i;

   for (i=0;i<3;i++)
   {
      fork();

      // This printf statement is for debugging purposes
      // getppid(): gets the parent's process ID
      // getpid(): gets the calling process's own ID

      printf("[%d] [%d] i=%d\n", getppid(), getpid(), i);
   }

   printf("[%d] [%d] hi\n", getppid(), getpid());

   return 0;
}

Here is the output:

[6909][6936] i=0
[6909][6936] i=1
[6936][6938] i=1
[6909][6936] i=2
[6909][6936] hi
[6936][6938] i=2
[6936][6938] hi
[6938][6940] i=2
[6938][6940] hi
[1][6937] i=0
[1][6939] i=2
[1][6939] hi
[1][6937] i=1
[6937][6941] i=1
[1][6937] i=2
[1][6937] hi
[6937][6941] i=2
[6937][6941] hi
[6937][6942] i=2
[6937][6942] hi
[1][6943] i=2
[1][6943] hi

I am a very visual person, and so the only way for me to truly understand things is by diagramming. My instructor said there would be 8 hi statements. I wrote and ran the code, and indeed there were 8 hi statements. But I really didn’t understand it. So I drew the following diagram:

Diagram updated to reflect comments :)

Observations:

  1. The parent process (main) iterates the loop 3 times, then calls printf
  2. On each iteration of the parent's for-loop, fork() is called
  3. After each fork() call, i is incremented, so every child starts its for-loop from the value i had before the increment
  4. After the for-loop ends, "hi" is printed

Here are my questions:

  • Is my diagram correct?
  • Why are there two instances of i=0 in the output?
  • What value of i is carried over to each child after the fork()? If the same value of i is carried over, then when does the "forking" stop?
  • Is it always the case that 2^n - 1 would be a way to count the number of children that are forked? So, here n=3, which means 2^3 - 1 = 8 - 1 = 7 children, which is correct?

Answer 1:

Here's how to understand it, starting at the for loop.

  1. Loop starts in parent, i == 0

  2. Parent fork()s, creating child 1.

  3. You now have two processes. Both print i=0.

  4. Loop restarts in both processes, now i == 1.

  5. Parent and child 1 fork(), creating children 2 and 3.

  6. You now have four processes. All four print i=1.

  7. Loop restarts in all four processes, now i == 2.

  8. Parent and children 1 through 3 all fork(), creating children 4 through 7.

  9. You now have eight processes. All eight print i=2.

  10. Loop restarts in all eight processes, now i == 3.

  11. Loop terminates in all eight processes, as i < 3 is no longer true.

  12. All eight processes print hi.

  13. All eight processes terminate.

So you get i=0 printed two times, i=1 printed four times, i=2 printed eight times, and hi printed eight times.



Answer 2:

  1. Yes, it's correct. (see below)
  2. No, i++ is executed after the call to fork(), because that's how the for loop works.
  3. If every fork() succeeds, yes. However, remember that fork() may fail.

A little explanation of the second point:

for (i = 0;i < 3; i++)
{
   fork();
}

is similar to:

i = 0;
while (i < 3)
{
    fork();
    i++;
}

So i in both processes (parent and child) holds the value from before the increment. However, the increment is executed immediately after fork(), so in my opinion the diagram can be treated as correct.



Answer 3:

To answer your questions one by one:

Is my diagram correct?

Yes, essentially. It's a very nice diagram, too.

That is to say, it's correct if you interpret the i=0 etc. labels as referring to full loop iterations. What the diagram doesn't show, however, is that, after each fork(), the part of the current loop iteration after the fork() call is also executed by the forked child process.

Why are there two instances of i=0 in the output?

Because you have the printf() after the fork(), so it's executed by both the parent process and the just forked child process. If you move the printf() before the fork(), it will only be executed by the parent (since the child process doesn't exist yet).

What value of i is carried over to each child after the fork()? If the same value of i is carried over, then when does the "forking" stop?

The value of i is not changed by fork(), so the child process sees the same value as its parent.

The thing to remember about fork() is that it's called once, but it returns twice — once in the parent process, and once in the newly cloned child process.

For a simpler example, consider the following code:

printf("This will be printed once.\n");
fork();
printf("This will be printed twice.\n");
fork();
printf("This will be printed four times.\n");
fork();
printf("This will be printed eight times.\n");

The child process created by fork() is an (almost) exact clone of its parent, and so, from its own viewpoint, it "remembers" being its parent, inheriting all of the parent process's state (including all variable values, the call stack and the instruction being executed). The only immediate difference (other than system metadata such as the process ID returned by getpid()) is the return value of fork(), which will be zero in the child process but non-zero (actually, the ID of the child process) in the parent.

Is it always the case that 2^n - 1 would be a way to count the number of children that are forked? So, here n=3, which means 2^3 - 1 = 8 - 1 = 7 children, which is correct?

Every process that executes a fork() turns into two processes (except under unusual error conditions, where fork() might fail). If the parent and child keep executing the same code (i.e. they don't check the return value of fork(), or their own process ID, and branch to different code paths based on it), then each subsequent fork will double the number of processes. So, yes, after three loop iterations you will end up with 2³ = 8 processes in total: the original process plus 2³ - 1 = 7 children.



Tags: c fork