I know that fork() returns differently for the child and parent processes, but I'm unable to find information on how this happens. How does the child process receive the return value 0 from fork? And what is the difference in regards to the call stack? As I understand it, for the parent it goes something like this:
parent process--invokes fork-->system_call--calls fork-->fork executes--returns to-->system_call--returns to-->parent process.
What happens in the child process?
The fork() system call returns twice (unless it fails). One of the returns is in the child process, and there the return value is 0.
The other return is in the parent process, and there the return value is non-zero (either -1 if the fork failed, or a positive value that is the PID of the child).
The main differences between the parent and the child are:
Other more obscure differences are listed in the POSIX standard.
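Two differences that are easy to observe from user space are the process ID and the parent process ID. The following is a minimal sketch, not taken from the answer above; the helper name show_fork_return() is made up for illustration. It forks once, prints what each side sees, and has the parent reap the child.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks once and reports what each side sees.
 * Returns 0 in the parent on success, -1 on failure. */
int show_fork_return(void)
{
    pid_t caller = getpid();      /* PID of the process that calls fork() */
    fflush(stdout);               /* keep buffered output from being copied into the child */
    pid_t ret = fork();

    if (ret < 0) {                /* failure: there is no child, only one return */
        perror("fork");
        return -1;
    }
    if (ret == 0) {               /* child: fork() returned 0 */
        printf("child:  fork()=0, pid=%ld, ppid=%ld\n",
               (long)getpid(), (long)getppid());
        fflush(stdout);           /* _exit() does not flush stdio buffers */
        /* The child's parent should be the process that called fork(). */
        _exit(getppid() == caller ? 0 : 1);
    }

    /* parent: fork() returned the child's PID */
    printf("parent: fork()=%ld, pid=%ld\n", (long)ret, (long)getpid());
    int status = 0;
    pid_t reaped = waitpid(ret, &status, 0);
    assert(reaped == ret);        /* the value fork() returned names the child */
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling show_fork_return() from main() should print one line from each process; which line appears first is up to the scheduler.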
In one sense, the "how" really isn't your problem. The operating system is required to achieve the result. However, the O/S clones the parent process, making a second (child) process which is an almost exact replica of the parent, setting the attributes that must be different to the correct new values, and usually marking the data pages as CoW (copy on write) or equivalent so that when one process modifies a value, it gets a separate copy of the page so as not to interfere with the other. This is not like the deprecated (by me at least - non-standard for POSIX) vfork() system call, which you would be wise to eschew even if it is available on your system. Each process continues after the fork() as if the function returns - so (as I said up top), the fork() system call returns twice, once in each of two processes which are near identical clones of each other.
I will try to answer from the process memory layout point of view. Please correct me if anything is wrong or inaccurate.
In traditional Unix, fork() is the only system call for process creation (the very first process, process 0, is the exception), so the question is really what happens during process creation in the kernel. Two kernel data structures are associated with a process: the struct proc array (aka the process table) and struct user (aka the u area).
To create a new process, these two data structures have to be properly created or initialized. The straightforward way is to copy them from the creator's (the parent's) proc entry and u area. Most data is duplicated between parent and child (e.g., the code segment), except the value in the return register (e.g. EAX on 80x86): the parent gets the child's PID and the child gets 0. From then on, there are two processes (the existing one and the new one) run by the scheduler, and when each is scheduled it returns its own value.
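The idea of "copy the parent's entry, then patch the return register" can be sketched as a toy user-space model. Everything here (toy_proc, toy_regs, toy_fork(), toy_demo(), the table size) is invented for illustration and is not the real struct proc or struct user; it only shows the shape of the bookkeeping.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-ins for kernel structures; NOT real kernel definitions. */
struct toy_regs { long ret; long pc; long sp; };   /* "ret" plays the role of EAX */
struct toy_proc { int pid; int ppid; struct toy_regs regs; };

#define TOY_NPROC 8
static struct toy_proc toy_table[TOY_NPROC];       /* toy process table */
static int toy_nproc = 0;
static int toy_next_pid = 1;

/* Returns the table index of the new child, or -1 if the table is full. */
int toy_fork(int parent_idx)
{
    struct toy_proc *parent = &toy_table[parent_idx];

    if (toy_nproc >= TOY_NPROC) {
        parent->regs.ret = -1;        /* failure is reported to the parent only */
        return -1;
    }
    int child_idx = toy_nproc++;
    struct toy_proc *child = &toy_table[child_idx];

    *child = *parent;                 /* duplicate the parent's entry, registers included */
    child->pid = toy_next_pid++;      /* attributes that must differ */
    child->ppid = parent->pid;
    child->regs.ret = 0;              /* the child will "return" 0 */
    parent->regs.ret = child->pid;    /* the parent will "return" the child's PID */
    return child_idx;
}

/* Sets up one process, forks it, and prints both saved return values. */
int toy_demo(void)
{
    toy_nproc = 0;
    toy_next_pid = 1;
    toy_table[0] = (struct toy_proc){
        .pid = toy_next_pid++, .ppid = 0,
        .regs = { .ret = 0, .pc = 0x1000, .sp = 0x7ff0 }
    };
    toy_nproc = 1;

    int c = toy_fork(0);
    assert(c >= 0);
    printf("parent pid=%d resumes with ret=%ld\n",
           toy_table[0].pid, toy_table[0].regs.ret);
    printf("child  pid=%d resumes with ret=%ld\n",
           toy_table[c].pid, toy_table[c].regs.ret);
    return c;
}
```

Both entries keep the same saved pc and sp, so in this model both "processes" resume at the same instruction; only the saved return value differs.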
% man fork
What happens is that inside the fork system call, the entire process is duplicated. Then, the fork call in each returns. These are different contexts now though, so they can return different return codes.
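That the two processes really are separate copies can be checked with a small experiment. This is a sketch under the assumption of a POSIX system; the function name counter_after_child_write() is made up. The child changes a variable, and the parent's copy is unaffected.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's value of `counter` after the
 * child has modified its own copy, or -1 on error. */
int counter_after_child_write(void)
{
    int counter = 10;

    fflush(stdout);                    /* keep buffered output out of the child */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                    /* child context */
        counter = 99;                  /* changes only the child's copy */
        _exit(counter == 99 ? 0 : 1);
    }

    /* parent context: wait so the child's write has definitely happened */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);

    printf("parent still sees counter=%d\n", counter);
    return counter;
}
```

The parent still sees 10 even though the child wrote 99, because after fork() each process has its own copy of the data.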
If you really want to know how it works at a low level, you can always check the source! The code is a bit confusing if you're not used to reading kernel code, but the inline comments give a pretty good hint as to what's going on.
The most interesting part of the source with an explicit answer to your question is at the very end of the fork() definition itself -
"td" apparently holds a list of the return values for different threads. I'm not sure exactly how this mechanism works (why there are not two separate "thread" structures). If error (returned from fork1, the "real" forking function) is 0 (no error), then take the "first" (parent) thread and set its return value to p2 (the new process)'s PID. If it's the "second" thread (in p2), then set the return value to 0.
The process appears identical from both sides, except for the differing return value (that's why the return value is there, so that the two processes can tell the difference at all!). As far as the child process is concerned, it has just been returned to from system_call in the same manner that the parent process was returned to.
The parent and child return different values because the kernel manipulates the CPU registers in the child's context.
Each process in the Linux kernel is represented by a task_struct. A pointer to the task_struct is kept in the thread_info structure, which lies at the end of the kernel-mode stack. The whole CPU context (the registers) is stored in this thread_info structure.
All of the fork()/clone() system calls call the equivalent kernel function do_fork().
Here is the sequence of execution:
do_fork() -> copy_process() -> copy_thread() (copy_thread() is an arch-specific function)
copy_thread() copies the register values from the parent and changes the return value to 0 (in the case of ARM).
When the child gets scheduled, it executes an assembly routine, ret_from_fork(), which returns zero. The parent gets its return value from do_fork(), which is the PID of the new process.
Steven Schlansker's answer is quite good, but just to add some more detail:
Every executing process has an associated context (hence "context switching") - this context includes, among other things, the process's code segment (containing the machine instructions), its heap memory, its stack, and its register contents. When a context switch occurs, the context from the old process is saved, and the context from the new process is loaded.
The location for a return value is defined by the ABI, to allow code interoperability. If I am writing ASM code for my x86-64 processor, and I call into the C runtime, I know that the return value is going to show up in the RAX register.
Putting these two things together, the logical conclusion is that the call to int pid = fork() results in two contexts where the next instruction to execute in each one is one that moves the value of RAX (the return value from the fork call) into the local variable pid. Of course, only one process can execute at a time on a single CPU, so the order in which these "returns" happen will be determined by the scheduler.