I read here that the clone() system call is used to create a thread in Linux. The signature of clone() requires the address of a start routine/function to be passed to it. But on this page it is written that fork() calls clone() internally. So my question is: how does the child process created by fork() start running the code that comes after the fork() call, i.e. why does it not require a function as a starting point?
If the links I provided have incorrect info, then please guide me to some better links/resources.
Thanks
For questions like this, always read the source code.
From glibc's nptl/sysdeps/unix/sysv/linux/fork.c (GitHub) (nptl = Native POSIX Thread Library for Linux) we can find the implementation of fork(), which is definitely not a syscall. We can see that the magic happens inside the ARCH_FORK macro, which is defined as an inline call to clone() in nptl/sysdeps/unix/sysv/linux/x86_64/fork.c (GitHub). But wait, no function or stack pointer is passed to this version of clone()! So, what is going on here?

Let's look at the implementation of clone() in glibc, then. It's in sysdeps/unix/sysv/linux/x86_64/clone.S (GitHub). You can see that it saves the function pointer on the child's stack, calls the clone syscall, and then the new process pops the function off the stack and calls it.

So it works like this:
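A minimal C sketch of that contrast, assuming x86_64 Linux with glibc. run_with_wrapper() goes through the clone() library function, which needs a function and a stack for the child. run_with_raw_syscall() issues the clone syscall directly with no function and a null stack, so the child simply resumes right after the call, the way a fork() child does. The function names, exit codes, and stack size are illustrative choices, not taken from glibc.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (1024 * 1024) /* arbitrary size for the demo */

/* Waits for pid and returns its exit code, or -1 on any failure. */
static int wait_for_exit_code(pid_t pid) {
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Start routine handed to the clone() library wrapper. */
static int child_fn(void *arg) {
    printf("wrapper child: entered child_fn(%s)\n", (const char *)arg);
    fflush(stdout);
    return 7; /* the wrapper turns this into the child's exit code */
}

/* Library wrapper: the child starts in child_fn on the stack we supply. */
int run_with_wrapper(void) {
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;
    fflush(stdout); /* avoid duplicating buffered output in the child */
    /* The stack grows down on x86_64, so pass the top of the block. */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE, SIGCHLD, "hello");
    int code = (pid == -1) ? -1 : wait_for_exit_code(pid);
    free(stack);
    return code;
}

/* Raw syscall: no function, null stack; both processes return from here. */
int run_with_raw_syscall(void) {
    fflush(stdout);
    long pid = syscall(SYS_clone, (long)SIGCHLD, 0L, 0L, 0L, 0L);
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: execution continued from the syscall's return point. */
        printf("raw child: resumed after the syscall\n");
        fflush(stdout);
        _exit(9);
    }
    return wait_for_exit_code((pid_t)pid);
}
```

Calling both functions from a main() should report exit codes 7 and 9 respectively. Bypassing glibc with a raw syscall like this is only meant as a demonstration; real programs should use fork() or the clone() wrapper.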
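And fork() is simply a wrapper that makes the clone syscall with no function to call, so the child just carries on from the point where the call returns.

Summary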
The actual clone() syscall does not take a function argument; it just continues from the return point, just like fork(). So both the clone() and fork() library functions are wrappers around the clone() syscall.

Documentation
My copy of the manual is somewhat more upfront about the fact that clone() is both a library function and a system call. However, I do find it somewhat misleading that clone() is found in section 2, rather than both section 2 and section 3. From the man page:

And,

Finally,
@Dietrich did a great job explaining by looking at the implementation. That's amazing! Anyway, there's another way of discovering that: by looking at the calls strace "sniffs".
We can prepare a very simple program that uses fork(2) and then check our hypothesis (i.e., that there's no fork syscall really happening).

Now, compile that code (clang -Wall -g fork.c -o fork.out) and then execute it with strace:

This will intercept the system calls made by our process (with -f we also intercept the child's calls) and put them into ./fork.trace.log; the -c option gives us a summary at the end. The result on my machine (Ubuntu 14.04, x86_64 Linux 3.16) is (summarized):

As expected, no fork calls. Just the raw clone syscall with its flags, child stack, etc. properly set.
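For anyone who wants to repeat the experiment, a test program of the kind described above could be built around a helper like the following. This is a minimal sketch, not the original fork.c; the name fork_and_wait and the exit code 3 are hypothetical choices made for illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks once. The child prints its pid and exits with code 3; the parent
 * waits for it and returns the child's exit code, or -1 on failure. */
int fork_and_wait(void) {
    fflush(stdout); /* avoid duplicating buffered output in the child */
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: execution resumes here, right after fork() returns 0. */
        printf("child: pid %d\n", (int)getpid());
        fflush(stdout);
        _exit(3);
    }
    /* Parent: fork() returned the child's pid. */
    int status = 0;
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    printf("parent: child %d finished\n", (int)pid);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Add a main() that calls fork_and_wait(), build it, and run the binary under strace with -f. On a glibc-based x86_64 system the trace should show a clone entry where the fork() call happens, though the exact flags can differ between glibc versions.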