Understanding Pthreads

Posted 2020-02-10 06:39

Question:

I came across a concept in Advanced Linux Programming; see section 4.5, GNU/Linux Thread Implementation.

I understand the concept the author describes, but I'm confused by the program he uses to print the process IDs of threads.

Here is the code:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
void* thread_function (void* arg)
{
    fprintf (stderr, "child thread pid is %d\n", (int) getpid ());
    /* Spin forever. */
    while (1);
    return NULL; 
}

int main ()
{
    pthread_t thread;
    fprintf (stderr, "main thread pid is %d\n", (int) getpid ());
    pthread_create (&thread, NULL, &thread_function, NULL);
    /* Spin forever. */
    while (1);
    return 0;
} 

According to the author, the output of the above code is:

% cc thread-pid.c -o thread-pid -lpthread
% ./thread-pid &
[1] 14608
main thread pid is 14608
child thread pid is 14610 

The output I get when I compile and run it is:

[1] 3106
main thread pid is 3106
child thread pid is 3106

I understand that to create a thread, Linux internally calls clone() (in most cases), just as the fork system call does to create a process. The only difference is that a thread created in a process shares that process's address space, while a process created by a parent gets a copy of the parent's address space. So I would expect getpid() called from any thread to return the same process ID, but that is not the result shown in the book.

Can someone explain what the author is talking about? Is the output in the book wrong, or is mine?

Answer 1:

I get the same results as the book on a Linux system whose libc is libuClibc-0.9.30.1.so (1):

root@OpenWrt:~# ./test
main thread pid is 1151
child thread pid is 1153

I then ran the same program on a Linux system that uses Ubuntu's libc6 (2):

$ ./test
main thread pid is 2609
child thread pid is 2609

libc (1) uses the LinuxThreads implementation of pthreads.

libc (2) uses the NPTL ("Native POSIX Thread Library") implementation of pthreads.

According to the LinuxThreads FAQ (answer J.3):

each thread is really a distinct process with a distinct PID, and signals sent to the PID of a thread can only be handled by that thread

So with the old libc, which uses the LinuxThreads implementation, each thread has its own distinct PID.

With the newer libc, which uses the NPTL implementation, all threads have the same PID as the main process.

NPTL was developed by a team at Red Hat. According to the Red Hat NPTL document (chapter "Problems with the Existing Implementation", page 5), one of the problems solved by the NPTL implementation is:

Each thread having a different process ID causes compatibility problems with other POSIX thread implementations. This is in part a moot point since signals can't be used very well but is still noticeable


That explains what you are seeing.

You are using a newer libc that contains the NPTL ("Native POSIX Thread Library") implementation of pthreads, while the book uses an old libc that contains the LinuxThreads implementation.
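If you want to check which implementation a given system uses, glibc can report it through confstr(). The following is only a sketch: it assumes a glibc-based system that defines _CS_GNU_LIBPTHREAD_VERSION, and the helper names pthread_impl_name and print_pthread_impl are made up for this example. Other C libraries (uClibc or musl, for instance) may not support the query, in which case the helper simply reports failure.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: copy the threading library's name and version
   into buf. Returns 0 on success, -1 if the value is unavailable or
   does not fit in len bytes. */
int pthread_impl_name(char *buf, size_t len)
{
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    size_t needed = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);

    /* 0 means the name is not supported; a value larger than len
       means the result was truncated. */
    if (needed == 0 || needed > len)
        return -1;
    return 0;
#else
    /* This C library does not expose the GNU extension at all. */
    (void) buf;
    (void) len;
    return -1;
#endif
}

/* Hypothetical helper: print the result, or "unknown" on failure. */
void print_pthread_impl(void)
{
    char name[64];

    if (pthread_impl_name(name, sizeof name) == 0)
        printf("threading library: %s\n", name);
    else
        printf("threading library: unknown\n");
}
```

Calling print_pthread_impl() from main is enough to try it. On a current glibc the string should start with "NPTL"; a glibc from the LinuxThreads era would be expected to report a linuxthreads version instead.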



Answer 2:

The text you're working from is very old (2001). Older versions of Linux implemented threads as separate processes with a shared address space, and each thread had a separate PID. However, this thread model was not POSIX compliant and had a number of portability problems.

Starting somewhere around kernel 2.6, Linux switched to the "Native POSIX Thread Library" (NPTL). In this implementation, threads do not get their own PIDs.



Answer 3:

All threads created by a process belong to that one process by definition. To get its process ID, call getpid(); the result is the same no matter which of the process's threads makes the call.

The author of the linked document is correct that under Linux, (p)threads are implemented as distinct processes sharing the address space of the one process they belong to. That is not reflected by getpid(), however, and the author is wrong to assume it is.

To get the ID of the distinct process created for a single thread, use the Linux-specific gettid()*1.


*1: Please note that glibc versions before 2.30 do not provide a wrapper for this system call. Use syscall(SYS_gettid) to invoke it there.
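As a rough illustration of the difference between getpid() and gettid(), here is a sketch that records both values in the calling thread and in a newly created thread. It assumes Linux with NPTL, and it goes through syscall(SYS_gettid) so that it also builds where no gettid() wrapper exists. The names struct ids, kernel_tid, record_ids, compare_ids and print_ids are invented for this example. Compile with -pthread (or -lpthread, as in the book).

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* IDs observed by one thread. */
struct ids {
    pid_t pid;  /* getpid(): the process ID */
    pid_t tid;  /* kernel thread ID from the gettid system call */
};

/* Linux-specific: read the kernel thread ID through the raw system call. */
static pid_t kernel_tid(void)
{
    return (pid_t) syscall(SYS_gettid);
}

/* Thread entry point: record the IDs this thread sees. */
static void *record_ids(void *arg)
{
    struct ids *out = arg;

    out->pid = getpid();
    out->tid = kernel_tid();
    return NULL;
}

/* Record the caller's IDs, then those seen by a freshly created thread.
   Returns 0 on success or a pthread error number. */
int compare_ids(struct ids *caller, struct ids *child)
{
    pthread_t thread;
    int rc;

    caller->pid = getpid();
    caller->tid = kernel_tid();

    rc = pthread_create(&thread, NULL, record_ids, child);
    if (rc != 0)
        return rc;
    return pthread_join(thread, NULL);
}

/* Print both sets of IDs in the style of the book's program. */
void print_ids(const struct ids *caller, const struct ids *child)
{
    fprintf(stderr, "calling thread: pid %d, tid %d\n",
            (int) caller->pid, (int) caller->tid);
    fprintf(stderr, "child thread:   pid %d, tid %d\n",
            (int) child->pid, (int) child->tid);
}
```

To try it, call compare_ids() and then print_ids() from main. Under NPTL the two pid values should match while the two tid values differ, which is the per-thread number the book's output was effectively showing.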



Answer 4:

In Linux, as the author pointed out, threads are lightweight processes sharing the same address space. Each process has a unique PID, while each thread has a thread ID (TID). The TID of the main thread doubles as the process ID. Note that pthread_self() returns the POSIX thread handle (a pthread_t) of the calling thread, which is not the same value as the kernel TID; to get the kernel TID, use gettid().
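The points about TIDs and pthread_self() can be checked with a small sketch like the one below. It assumes Linux with NPTL; struct thread_report, struct thread_args, fill_ids, child_main and inspect_threads are names made up for this example. The pthread_t handles are compared with pthread_equal() because the type is opaque, while the kernel TID is an ordinary number that can be compared with getpid(). Compile with -pthread.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <unistd.h>

/* What one thread observes about itself. */
struct thread_report {
    long kernel_tid;       /* kernel thread ID from the gettid system call */
    int  is_group_leader;  /* 1 if kernel_tid equals getpid() */
    int  same_as_creator;  /* 1 if pthread_self() matches the creator */
};

/* Arguments handed to the new thread. */
struct thread_args {
    pthread_t creator;             /* pthread_self() of the creating thread */
    struct thread_report *report;  /* where the new thread writes its data */
};

/* Fill in the kernel-level IDs for the calling thread. */
static void fill_ids(struct thread_report *r)
{
    r->kernel_tid = syscall(SYS_gettid);
    r->is_group_leader = (r->kernel_tid == (long) getpid());
}

/* Thread entry point: report on the new thread. */
static void *child_main(void *arg)
{
    struct thread_args *a = arg;

    fill_ids(a->report);
    /* pthread_t is opaque, so compare handles with pthread_equal(). */
    a->report->same_as_creator =
        pthread_equal(a->creator, pthread_self()) != 0;
    return NULL;
}

/* Report on the caller and on one newly created thread.
   Returns 0 on success or a pthread error number. */
int inspect_threads(struct thread_report *caller,
                    struct thread_report *child)
{
    struct thread_args args;
    pthread_t handle;
    int rc;

    fill_ids(caller);
    caller->same_as_creator = 1;  /* the caller is the creator */

    args.creator = pthread_self();
    args.report = child;
    rc = pthread_create(&handle, NULL, child_main, &args);
    if (rc != 0)
        return rc;
    /* args stays valid because we wait for the thread here. */
    return pthread_join(handle, NULL);
}
```

When inspect_threads() is called from main, caller->is_group_leader should come back as 1 (the main thread's TID equals the PID), while the child reports 0 for both is_group_leader and same_as_creator.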