I came across a concept in Advanced Linux Programming; see section 4.5, GNU/Linux Thread Implementation.
I understand the concept the author describes, but I'm confused by the program he uses to print the process IDs of threads.
Here is the code:
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void* thread_function (void* arg)
{
  fprintf (stderr, "child thread pid is %d\n", (int) getpid ());
  /* Spin forever. */
  while (1);
  return NULL;
}

int main ()
{
  pthread_t thread;
  fprintf (stderr, "main thread pid is %d\n", (int) getpid ());
  pthread_create (&thread, NULL, &thread_function, NULL);
  /* Spin forever. */
  while (1);
  return 0;
}
According to the author, the output of the above code is
% cc thread-pid.c -o thread-pid -lpthread
% ./thread-pid &
[1] 14608
main thread pid is 14608
child thread pid is 14610
The output I get when I compile and run it is
[1] 3106
main thread pid is 3106
child thread pid is 3106
I understand that to create a thread, Linux internally calls clone() (in most cases), much as fork does to create a process. The only difference is that a thread shares the address space of the process that created it, whereas a child process gets a copy of its parent's address space. So I would expect getpid() to print the same process ID in every thread, but that is not the result shown in the book.
Please tell me what he is talking about. Is the book's answer wrong, or is mine?
I get the same results as the book on a Linux system whose libc is libuClibc-0.9.30.1.so (1):
root@OpenWrt:~# ./test
main thread pid is 1151
child thread pid is 1153
and I also ran this program on a Linux system that uses Ubuntu's libc6 (2):
$ ./test
main thread pid is 2609
child thread pid is 2609
libc (1) uses the linuxthreads implementation of pthread, and libc (2) uses the NPTL ("Native POSIX Thread Library") implementation of pthread.
According to the linuxthreads FAQ (in J.3 answer):
each thread is really a distinct process with a distinct PID, and
signals sent to the PID of a thread can only be handled by that thread
So in the old libc, which uses the linuxthreads implementation, each thread has its own distinct PID.
In the new libc, which uses the NPTL implementation, all threads have the same PID as the main process.
NPTL was developed by a Red Hat team. According to the Red Hat NPTL document, one of the problems solved by the NPTL implementation is (chapter "Problems with the Existing Implementation", page 5):
Each thread having a different process ID causes compatibility
problems with other POSIX thread implementations. This is in part a
moot point since signals can't be used very well but is still
noticeable
And that explains your issue.
You are using a new libc version that contains the NPTL implementation of pthread, while the book uses an old version of libc that contains the linuxthreads implementation.
The text you're working from is very old (2001). Older versions of Linux implemented threads as separate processes with a shared address space. Each thread had a separate pid. However this thread model was not POSIX compliant and had a number of portability problems.
Starting somewhere around 2.6, Linux switched to the "Native POSIX Thread Library" (NPTL). In this implementation, threads do not get their own PIDs.
All threads created by a process belong to this one process by definition. To get this process's ID, use getpid(), no matter which of the process's threads you call it from.
The author of the linked document is correct that under Linux (p)threads are implemented as distinct processes sharing the same address space as the one process they belong to. However, getpid() does not reflect this; the author of the linked document is wrong in this assumption.
To get the ID of the distinct process created for a single thread, use the Linux-specific gettid() *1.
*1: Please note that glibc versions before 2.30 do not provide a wrapper for this system call; use syscall() to invoke it there.
In Linux, as the author pointed out, threads are lightweight processes sharing the same address space. Each process has a unique PID, while each thread has a thread ID (TID). The TID of the main thread also serves as the process ID. To get an identifier for the calling thread, you can use the pthread_self() function; note that it returns an opaque pthread_t, which is not the same value as the kernel TID returned by gettid().