How to make a thread wait for another one in Linux

For example, I want to create 5 threads and print them. How do I make the fourth one execute before the second one? I tried locking it with a mutex, but I don't know how to make only the second one locked, so it gives me a segmentation fault.

Normally, you define the order of operations, not the order of the threads that do those operations. It may sound like a trivial distinction, but when you start implementing it, you'll see it makes a major difference. It is also a more efficient approach, because you think not about the number of threads you need, but about the number of operations or tasks to be done, how many of them can be done in parallel, and how they might need to be ordered or sequenced.
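To make the "order the tasks, not the threads" idea concrete, here is a minimal sketch of a small thread pool. All names in it (`pool_worker`, `run_pool`, `task_next`, and so on) are made up for this illustration, and squaring the task index merely stands in for real work. Tasks are handed out strictly in index order, but it does not matter which thread picks up which task.

```c
#include <assert.h>
#include <pthread.h>

#define  MAX_TASKS    64   /* arbitrary limits for this sketch */
#define  MAX_THREADS  16

static pthread_mutex_t  task_lock  = PTHREAD_MUTEX_INITIALIZER;
static int              task_next  = 0;   /* index of the next unclaimed task */
static int              task_total = 0;   /* number of tasks in this run */
static int              task_result[MAX_TASKS];

static void *pool_worker(void *unused)
{
    (void)unused;

    for (;;) {
        int task;

        /* Claim the next task; tasks are handed out in index order. */
        pthread_mutex_lock(&task_lock);
        if (task_next >= task_total) {
            pthread_mutex_unlock(&task_lock);
            return NULL;
        }
        task = task_next++;
        pthread_mutex_unlock(&task_lock);

        /* Do the work outside the lock, so that tasks can overlap. */
        assert(task < MAX_TASKS);
        task_result[task] = task * task;
    }
}

/* Hypothetical driver: run 'tasks' tasks on 'threads' threads and copy
   the results to results[]. Returns the number of tasks done, or -1. */
int run_pool(int threads, int tasks, int *results)
{
    pthread_t ptids[MAX_THREADS];
    int i, created = 0;

    if (threads < 1 || threads > MAX_THREADS || tasks < 0 || tasks > MAX_TASKS)
        return -1;

    /* No worker exists yet, so plain assignment is safe here. */
    task_next  = 0;
    task_total = tasks;

    for (i = 0; i < threads; i++) {
        if (pthread_create(&ptids[i], NULL, pool_worker, NULL))
            break;
        created++;
    }
    for (i = 0; i < created; i++)
        pthread_join(ptids[i], NULL);

    if (created < 1)
        return -1;

    /* Even a single worker drains the whole task list. */
    for (i = 0; i < task_next; i++)
        results[i] = task_result[i];
    return task_next;
}
```

Calling `run_pool(4, 10, results)` should fill `results[0]` through `results[9]` with the squares of 0 through 9, whichever thread happened to compute each one.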

For learning purposes, however, it might make sense to look at ordering threads instead.

The OP passes a pointer to a string for each worker thread function. That works, but is slightly odd; typically you pass an integer identifier instead:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <inttypes.h>
#include <pthread.h>

#define  ID_TO_POINTER(id)  ((void *)((intptr_t)(id)))
#define  POINTER_TO_ID(ptr) ((intptr_t)(ptr))

The conversion of the ID type -- which I assume to be a signed integer above, typically either an int or a long -- to a pointer is done via two casts. The first cast is to intptr_t type defined in <stdint.h> (which gets automatically included when you include <inttypes.h>), which is a signed integer type that can hold the value of any void pointer; the second cast is to a void pointer. The intermediate cast avoids a warning in case your ID is of an integer type that cannot be converted to/from a void pointer without potential loss of information (usually described in the warning as "of different size").
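As a quick illustration of that round trip, here is a minimal sketch. The helper name `roundtrip_id` is made up for this example; it packs an ID the way you would hand it to pthread_create(), and unpacks it the way the worker function would. It assumes a mainstream platform where a small integer survives the trip through a void pointer unchanged.

```c
#include <assert.h>
#include <inttypes.h>

#define  ID_TO_POINTER(id)  ((void *)((intptr_t)(id)))
#define  POINTER_TO_ID(ptr) ((intptr_t)(ptr))

/* Hypothetical helper: convert an int ID to a void pointer and back. */
int roundtrip_id(int id)
{
    /* This is what you would pass as the last argument of pthread_create(). */
    void *packed = ID_TO_POINTER(id);

    /* This is what the worker function recovers from its argument. */
    int unpacked = (int)POINTER_TO_ID(packed);

    /* The pointer is only a container for the number; never dereference it. */
    assert(unpacked == id);
    return unpacked;
}
```

The explicit `(int)` cast in the second step is there because POINTER_TO_ID() yields an intptr_t, which may be wider than int.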

The simplest method of ordering POSIX threads, which is not that dissimilar to ordering operations or tasks or jobs, is to use a single mutex as a lock to protect the ID of the thread that should run next, and a related condition variable for threads to wait on until their ID appears.

The one problem left is how to define the order. Typically, you'd simply increment or decrement the ID value. Decrementing means the threads run in descending order of ID value, and an ID value of -1 (assuming you number your threads from 0 onwards) always means "all done", regardless of the number of threads used:

static pthread_mutex_t  worker_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   worker_wait = PTHREAD_COND_INITIALIZER;
static int              worker_id   = 4;  /* number of threads - 1; here, five threads */

void *worker(void *dataptr)
{
    const int id = POINTER_TO_ID(dataptr);

    pthread_mutex_lock(&worker_lock);
    while (worker_id >= 0) {
        if (worker_id == id) {

            /* Do the work! */
            printf("Worker %d running.\n", id);
            fflush(stdout);

            /* Choose next worker */
            worker_id--;
            pthread_cond_broadcast(&worker_wait);

        } else {

            /* Not our turn. Wait for someone else
               to broadcast on the condition. Waiting
               unconditionally would leave the last
               worker asleep forever. */
            pthread_cond_wait(&worker_wait, &worker_lock);
        }
    }

    /* All done; worker_id became negative.
       We still hold the mutex; release it. */
    pthread_mutex_unlock(&worker_lock);

    return NULL;
}
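For completeness, here is a minimal sketch of how such a worker could be driven. The driver `run_descending()`, the `run_log` array and the `MAX_THREADS` limit are assumptions made for this example, not part of the original code. The worker only waits when it is not its turn, and records its ID so the resulting order can be inspected afterwards.

```c
#include <assert.h>
#include <inttypes.h>
#include <pthread.h>
#include <stdio.h>

#define  ID_TO_POINTER(id)  ((void *)((intptr_t)(id)))
#define  POINTER_TO_ID(ptr) ((intptr_t)(ptr))

#define  MAX_THREADS  16   /* arbitrary limit for this sketch */

static pthread_mutex_t  worker_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   worker_wait = PTHREAD_COND_INITIALIZER;
static int              worker_id   = -1;
static int              run_log[MAX_THREADS];  /* IDs in the order they ran */
static int              run_count   = 0;

static void *worker(void *dataptr)
{
    const int id = (int)POINTER_TO_ID(dataptr);

    pthread_mutex_lock(&worker_lock);
    while (worker_id >= 0) {
        if (worker_id == id) {
            /* Our turn: do the work, then hand over to the next ID. */
            printf("Worker %d running.\n", id);
            assert(run_count < MAX_THREADS);
            run_log[run_count++] = id;
            worker_id--;
            pthread_cond_broadcast(&worker_wait);
        } else {
            /* Not our turn: sleep until someone changes worker_id. */
            pthread_cond_wait(&worker_wait, &worker_lock);
        }
    }
    pthread_mutex_unlock(&worker_lock);
    return NULL;
}

/* Hypothetical driver: start 'threads' workers, wait for all of them,
   and copy the observed order into order[]. Returns the number of
   entries written, or -1 on error. */
int run_descending(int threads, int *order)
{
    pthread_t ptids[MAX_THREADS];
    int i, created = 0;

    if (threads < 1 || threads > MAX_THREADS)
        return -1;

    /* No worker exists yet, so plain assignment is safe here. */
    worker_id = threads - 1;
    run_count = 0;

    for (i = 0; i < threads; i++) {
        if (pthread_create(&ptids[i], NULL, worker, ID_TO_POINTER(i)))
            break;
        created++;
    }

    if (created < threads) {
        /* Could not start everyone: tell the started workers to quit. */
        pthread_mutex_lock(&worker_lock);
        worker_id = -1;
        pthread_cond_broadcast(&worker_wait);
        pthread_mutex_unlock(&worker_lock);
    }

    for (i = 0; i < created; i++)
        pthread_join(ptids[i], NULL);

    if (created < threads)
        return -1;

    for (i = 0; i < run_count; i++)
        order[i] = run_log[i];
    return run_count;
}
```

With five threads, `run_descending(5, order)` should leave 4, 3, 2, 1, 0 in `order`, no matter in which order the kernel happens to start the threads.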

Note that I didn't let the worker exit immediately after its task is done. That is because I wanted to expand the example a bit. Let's say you want to define the order of operations in an array:

static pthread_mutex_t  worker_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   worker_wait = PTHREAD_COND_INITIALIZER;
static int              worker_order[] = { 0, 1, 2, 3, 4, 2, 3, 1, 4, -1 };
static int             *worker_idptr = worker_order;

void *worker(void *dataptr)
{
    const int id = POINTER_TO_ID(dataptr);

    pthread_mutex_lock(&worker_lock);
    while (*worker_idptr >= 0) {
        if (*worker_idptr == id) {

            /* Do the work! */
            printf("Worker %d running.\n", id);
            fflush(stdout);

            /* Choose next worker */
            worker_idptr++;
            pthread_cond_broadcast(&worker_wait);

        } else {

            /* Not our turn. Wait for someone else
               to broadcast on the condition. */
            pthread_cond_wait(&worker_wait, &worker_lock);
        }
    }

    /* All done; *worker_idptr became negative.
       We still hold the mutex; release it. */
    pthread_mutex_unlock(&worker_lock);

    return NULL;
}

See how little changed?
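Here is a minimal sketch of the array-driven variant as a callable unit. The driver `run_schedule()`, the `run_log` array and the size limits are assumptions made for this example. The schedule must end with a negative value, and every ID in it must belong to an existing worker, otherwise everyone would wait forever for a thread that does not exist; the driver checks for that up front.

```c
#include <assert.h>
#include <inttypes.h>
#include <pthread.h>
#include <stdio.h>

#define  ID_TO_POINTER(id)  ((void *)((intptr_t)(id)))
#define  POINTER_TO_ID(ptr) ((intptr_t)(ptr))

#define  MAX_THREADS  16   /* arbitrary limits for this sketch */
#define  MAX_LOG      64

static const int        stop_marker  = -1;
static pthread_mutex_t  worker_lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   worker_wait  = PTHREAD_COND_INITIALIZER;
static const int       *worker_idptr = &stop_marker;
static int              run_log[MAX_LOG];  /* IDs in the order they ran */
static int              run_count    = 0;

static void *worker(void *dataptr)
{
    const int id = (int)POINTER_TO_ID(dataptr);

    pthread_mutex_lock(&worker_lock);
    while (*worker_idptr >= 0) {
        if (*worker_idptr == id) {
            /* Our turn: do the work, then step to the next entry. */
            printf("Worker %d running.\n", id);
            assert(run_count < MAX_LOG);
            run_log[run_count++] = id;
            worker_idptr++;
            pthread_cond_broadcast(&worker_wait);
        } else {
            /* Not our turn: sleep until the schedule advances. */
            pthread_cond_wait(&worker_wait, &worker_lock);
        }
    }
    pthread_mutex_unlock(&worker_lock);
    return NULL;
}

/* Hypothetical driver: run 'threads' workers through a schedule that
   is terminated by a negative value. Copies the observed order into
   order[] and returns its length, or -1 on error. */
int run_schedule(const int *schedule, int threads, int *order)
{
    pthread_t ptids[MAX_THREADS];
    int i, len = 0, created = 0;

    if (threads < 1 || threads > MAX_THREADS)
        return -1;

    /* Reject schedules that are too long or name a missing worker. */
    while (schedule[len] >= 0) {
        if (len >= MAX_LOG || schedule[len] >= threads)
            return -1;
        len++;
    }

    /* No worker exists yet, so plain assignment is safe here. */
    worker_idptr = schedule;
    run_count = 0;

    for (i = 0; i < threads; i++) {
        if (pthread_create(&ptids[i], NULL, worker, ID_TO_POINTER(i)))
            break;
        created++;
    }

    if (created < threads) {
        /* Could not start everyone: point at a terminator instead. */
        pthread_mutex_lock(&worker_lock);
        worker_idptr = &stop_marker;
        pthread_cond_broadcast(&worker_wait);
        pthread_mutex_unlock(&worker_lock);
    }

    for (i = 0; i < created; i++)
        pthread_join(ptids[i], NULL);

    if (created < threads)
        return -1;

    for (i = 0; i < run_count; i++)
        order[i] = run_log[i];
    return run_count;
}
```

Because a worker re-checks the schedule right after doing its work, the same ID may even appear twice in a row.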

Let's consider a third case: a separate thread, say the main thread, decides which thread will run next. In this case, we need two condition variables: one for the workers to wait on, and the other for the main thread to wait on.

static pthread_mutex_t  worker_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   worker_wait = PTHREAD_COND_INITIALIZER;
static pthread_cond_t   worker_done = PTHREAD_COND_INITIALIZER;
static int              worker_id = 0;

void *worker(void *dataptr)
{
    const int id = POINTER_TO_ID(dataptr);

    pthread_mutex_lock(&worker_lock);
    while (worker_id >= 0) {
        if (worker_id == id) {

            /* Do the work! */
            printf("Worker %d running.\n", id);
            fflush(stdout);

            /* Notify we are done. Since there is only
               one thread waiting on the _done condition,
               we can use _signal instead of _broadcast. */
            pthread_cond_signal(&worker_done);
        }

        /* Wait for a change in the worker_id. */
        pthread_cond_wait(&worker_wait, &worker_lock);
    }

    /* All done; worker_id became negative.
       We still hold the mutex; release it. */
    pthread_mutex_unlock(&worker_lock);

    return NULL;
}

The thread that decides which worker runs next should hold the worker_lock mutex while the worker threads are created, then wait on the worker_done condition variable. When the first worker completes its task, it signals on the worker_done condition variable, and waits on the worker_wait condition variable. The decider thread should then change worker_id to the next ID that should run, and broadcast on the worker_wait condition variable. This continues until the decider thread sets worker_id to a negative value. For example:

int             threads; /* number of threads to create */
pthread_t      *ptids;   /* already allocated for that many */
pthread_attr_t  attrs;
int             i, result;

/* Workers this simple need very little stack. The size must be
   at least PTHREAD_STACK_MIN, which is larger than 65536 on some
   platforms; if the call is rejected, the default size is used. */
pthread_attr_init(&attrs);
pthread_attr_setstacksize(&attrs, 65536);

/* Hold the worker_lock. */
pthread_mutex_lock(&worker_lock);

/* Create 'threads' threads. */
for (i = 0; i < threads; i++) {
    result = pthread_create(&(ptids[i]), &attrs, worker, ID_TO_POINTER(i));
    if (result) {
        fprintf(stderr, "Cannot create worker threads: %s.\n", strerror(result));
        exit(EXIT_FAILURE);
    }
}

/* Thread attributes are no longer needed. */
pthread_attr_destroy(&attrs);

while (1) {

    /* 
       TODO: Set worker_id to a new value, or
             break when done.
    */

    /* Wake that worker */
    pthread_cond_broadcast(&worker_wait);

    /* Wait for that worker to complete */
    pthread_cond_wait(&worker_done, &worker_lock);
}

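/* Tell workers to exit */
worker_id = -1;
pthread_cond_broadcast(&worker_wait);

/* Release the mutex, so that the workers can return
   from pthread_cond_wait() and see the new worker_id. */
pthread_mutex_unlock(&worker_lock);

/* and reap the workers */
for (i = 0; i < threads; i++)
    pthread_join(ptids[i], NULL);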
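To tie the decider and the workers together, here is a minimal sketch of the whole scheme as one callable unit. It deviates from the code above in two ways, both of which are my assumptions for this example: the decider takes its decisions from a `schedule` array (standing in for the TODO), and a `task_pending` flag is added so that neither side acts on a wakeup that was not meant for it. The names `run_decided`, `run_log` and the size limits are likewise made up.

```c
#include <assert.h>
#include <inttypes.h>
#include <pthread.h>
#include <stdio.h>

#define  ID_TO_POINTER(id)  ((void *)((intptr_t)(id)))
#define  POINTER_TO_ID(ptr) ((intptr_t)(ptr))

#define  MAX_THREADS  16   /* arbitrary limits for this sketch */
#define  MAX_LOG      64

static pthread_mutex_t  worker_lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   worker_wait  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t   worker_done  = PTHREAD_COND_INITIALIZER;
static int              worker_id    = 0;
static int              task_pending = 0;  /* 1 while a task awaits its worker */
static int              run_log[MAX_LOG];
static int              run_count    = 0;

static void *worker(void *dataptr)
{
    const int id = (int)POINTER_TO_ID(dataptr);

    pthread_mutex_lock(&worker_lock);
    while (worker_id >= 0) {
        if (worker_id == id && task_pending) {
            printf("Worker %d running.\n", id);
            assert(run_count < MAX_LOG);
            run_log[run_count++] = id;

            /* Mark the task as done and notify the decider. */
            task_pending = 0;
            pthread_cond_signal(&worker_done);
        }
        /* Wait for the decider to change worker_id. */
        pthread_cond_wait(&worker_wait, &worker_lock);
    }
    pthread_mutex_unlock(&worker_lock);
    return NULL;
}

/* Hypothetical decider: runs schedule[0..count-1] one task at a time.
   Copies the observed order to order[]; returns its length or -1. */
int run_decided(const int *schedule, int count, int threads, int *order)
{
    pthread_t ptids[MAX_THREADS];
    int i, created = 0;

    if (threads < 1 || threads > MAX_THREADS || count < 0 || count > MAX_LOG)
        return -1;
    for (i = 0; i < count; i++)
        if (schedule[i] < 0 || schedule[i] >= threads)
            return -1;

    /* Hold the lock while the workers are being created. */
    pthread_mutex_lock(&worker_lock);
    worker_id = 0;
    task_pending = 0;
    run_count = 0;

    for (i = 0; i < threads; i++) {
        if (pthread_create(&ptids[i], NULL, worker, ID_TO_POINTER(i)))
            break;
        created++;
    }

    for (i = 0; created == threads && i < count; i++) {
        /* Pick the next worker and wake everyone up to look. */
        worker_id = schedule[i];
        task_pending = 1;
        pthread_cond_broadcast(&worker_wait);

        /* Wait until that worker reports completion. */
        while (task_pending)
            pthread_cond_wait(&worker_done, &worker_lock);
    }

    /* Tell workers to exit, and let go of the mutex so they can. */
    worker_id = -1;
    pthread_cond_broadcast(&worker_wait);
    pthread_mutex_unlock(&worker_lock);

    for (i = 0; i < created; i++)
        pthread_join(ptids[i], NULL);

    if (created < threads)
        return -1;
    for (i = 0; i < run_count; i++)
        order[i] = run_log[i];
    return run_count;
}
```

Note the unlock before the joins: the workers cannot return from pthread_cond_wait() while the decider still holds worker_lock.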

There is a very important detail in all of the above examples that may be hard to understand without a lot of practice: the way mutexes and condition variables interact (when paired via pthread_cond_wait()).

When a thread calls pthread_cond_wait(), it atomically releases the specified mutex and starts waiting for signals/broadcasts on the condition variable. "Atomic" means that there is no time in between the two; nothing can occur in between. The call returns after a signal or broadcast is received, and only once the thread has re-acquired the mutex. The difference between the two is that a signal wakes at least one waiter, and you have no say over which one, whereas a broadcast wakes all threads waiting on the condition variable. You can think of this as if the signal/broadcast first wakes up the thread, but pthread_cond_wait() only returns when it has re-acquired the mutex. POSIX also allows the call to return without any signal at all (a "spurious wakeup"), which is why a wait should always sit inside a loop that re-checks the shared state.

This behaviour is implicitly used in all of the examples above. In particular, you'll notice that the pthread_cond_signal()/pthread_cond_broadcast() is always done while holding the worker_lock mutex; this ensures that the other thread or threads wake up and get to act only after the worker_lock mutex is unlocked -- either explicitly, or by the holding thread waiting on a condition variable.
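The interplay can be boiled down to a one-shot "gate", sketched below. Everything in it (`gate_lock`, `gate_cond`, `gate_open`, `waiter`, `run_gate_demo`) is a made-up name for this illustration. The waiters sleep in a loop until the flag is set; the opener sets the flag and broadcasts while holding the mutex, and no waiter can get past the gate until the opener unlocks.

```c
#include <assert.h>
#include <pthread.h>

#define  MAX_WAITERS  8   /* arbitrary limit for this sketch */

static pthread_mutex_t  gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   gate_cond = PTHREAD_COND_INITIALIZER;
static int              gate_open = 0;  /* the shared state being waited for */
static int              passed    = 0;  /* number of waiters that got through */

static void *waiter(void *unused)
{
    (void)unused;

    pthread_mutex_lock(&gate_lock);

    /* Re-check the state after every wakeup, spurious or not. */
    while (!gate_open)
        pthread_cond_wait(&gate_cond, &gate_lock);

    /* pthread_cond_wait() has re-acquired the mutex for us. */
    passed++;

    pthread_mutex_unlock(&gate_lock);
    return NULL;
}

/* Hypothetical driver: start 'waiters' threads, open the gate, and
   return how many threads passed through it, or -1 on error. */
int run_gate_demo(int waiters)
{
    pthread_t ptids[MAX_WAITERS];
    int i, created = 0;

    if (waiters < 1 || waiters > MAX_WAITERS)
        return -1;

    /* No waiter exists yet, so plain assignment is safe here. */
    gate_open = 0;
    passed = 0;

    for (i = 0; i < waiters; i++) {
        if (pthread_create(&ptids[i], NULL, waiter, NULL))
            break;
        created++;
    }

    pthread_mutex_lock(&gate_lock);
    gate_open = 1;
    pthread_cond_broadcast(&gate_cond);

    /* The waiters may be awake by now, but none of them can act
       before we release the mutex, so nobody has passed yet. */
    assert(passed == 0);

    pthread_mutex_unlock(&gate_lock);

    for (i = 0; i < created; i++)
        pthread_join(ptids[i], NULL);

    return (created == waiters) ? passed : -1;
}
```

A waiter that starts late and finds the gate already open never calls pthread_cond_wait() at all, which is exactly why the flag is checked before waiting rather than after.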

I thought I might draw a directed graph (using Graphviz) of the order of events and actions, but this "answer" is already too long. I do suggest you do it yourself -- perhaps on paper? -- as that kind of visualization was very useful to me when I was learning about all this stuff.

I do feel quite uncomfortable about the above scheme, I must admit. At any one time, only one thread is running, and that is basically wrong: any job where tasks must be done in a specific order should only require one thread.

However, I showed the above examples in order for you (not just OP, but any C programmer interested in POSIX threads) to get more comfortable with using mutexes and condition variables.