Barriers & Conditions

Barriers

A barrier is a point in our program we would like all threads to reach before continuing on. In the diagram below, there is a barrier between section 1 and section 2:

This can be useful for several reasons:

Multi-part computations.
Timing calculations.
Debugging.

In order to implement a barrier, we need all but the slowest thread to wait at the barrier before continuing past it:

Barrier Example

In the program below, say we want to introduce a barrier after each thread is done the "computation".


#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <stdio.h>

/* the number of threads */
int num_threads;

/* the function called for each thread */
void* worker(void* idp) {
    /* get our thread id */
    int id = * (int*) idp;

    /* we start to work */
    printf("Thread %d starting!\n", id);

    /* simulate the threads taking slightly different amounts of time by sleeping
     * for our thread id seconds */
    sleep(id);
    printf("Thread %d is done its work!\n", id);


    /* TODO create a barrier here! */




    printf("Thread %d is past the barrier!\n", id); 
    return NULL;
}

int main (int argc, char** argv) {
    /* get the number of threads */
    if (argc < 2) {
        printf("Pass the number of threads to use!\n");
        return 0;
    } else {
        num_threads = atoi(argv[1]);
    }

    /* an array of threads */
    pthread_t threads[num_threads];
    int ids[num_threads];

    /* spawn all threads */
    for (int i = 0; i < num_threads; i++) {
        ids[i] = i;
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    }

    /* join all threads */
    for (int i = 0; i < num_threads; i++) {
        pthread_join(threads[i], NULL);
    }

    return 0;
}

If we had a barrier in place, what should the output of this program be?

Busy-wait Barriers

How could we implement a barrier in the above program with a busy-wait loop?

What are the downsides of this?

Semaphore Barriers

We can also use semaphores to implement a barrier. The basic idea is:

Set a global counter to 0. This represents the number of threads that have so far reached the barrier.
We have one semaphore which is used to protect the counter variable (initialized to 1 - unlocked).
We have a second semaphore which is used to block threads in the barrier (initialized to 0 - locked).

Upon reaching the barrier, each thread will wait until the counter variable is free (waiting on the first semaphore).

It then will check if that counter is equal to (NUM_THREADS - 1):

If not, then it is not the last thread to enter the barrier. It increments the counter, then releases it. Finally it waits on the second semaphore, the one keeping track of the barrier.
If so, then it is the last thread to enter the barrier. It is responsible for freeing all of the others from their waiting. It posts to the second semaphore (NUM_THREADS - 1) times - one for each waiting thread.

This example implements this technique:


/* the function called for each thread */
void* worker(void* idp) {
    /* get our thread id */
    int id = * (int*) idp;

    /* we start to work */
    printf("Thread %d starting!\n", id);

    /* simulate the threads taking slightly different amounts of time by sleeping
     * for our thread id seconds */
    sleep(id);
    printf("Thread %d is done its work!\n", id);




    /***************** begin barrier here *****************/
    /* wait for the counter first */
    sem_wait(&count_sem);

    /* check if we are the last one to the barrier */
    if (counter == (num_threads - 1)) {
        /* we are the last thread */

        /* reset counter and free it */
        counter = 0;
        sem_post(&count_sem);

        /* post once for each waiting thread */
        for (int i = 0; i < (num_threads - 1); i++) {
            sem_post(&barrier_sem);
        }
    } else {
        /* we are NOT the last thread */

        /* increment the counter and free it */
        counter++;
        sem_post(&count_sem);

        /* now wait to be freed */
        sem_wait(&barrier_sem);
    }
    /***************** end barrier here *****************/



    printf("Thread %d is past the barrier!\n", id); 
    return NULL;
}

Condition Variables

A somewhat more straightforward approach to creating a barrier is to use a condition variable. A condition variable is used to communicate events between threads. Conditions are always associated with a mutex.

Condition variables are declared with:


pthread_cond_t condition;

And initialized with:


pthread_cond_init(pthread_cond_t* condition, pthread_condattr_t* attributes);

As with a mutex, we usually pass NULL for the attributes.

We can wait on a condition with:


pthread_cond_wait(pthread_cond_t* condition, pthread_mutex_t* mutex);

This function unlocks the given mutex, then waits for a thread to signal the condition passed in. Finally it re-locks the mutex.

A thread can signal that the condition has occurred with either:


pthread_cond_signal(pthread_cond_t* condition);
pthread_cond_broadcast(pthread_cond_t* condition);

The first will unblock one of the waiting threads. The second will unblock all of the threads.

Condition Barrier Example

We can create a condition that represents the event "all threads have reached the barrier". This leads to code like this example:


/* our counter and condition */
int counter = 0;
pthread_cond_t condition;
pthread_mutex_t condition_mutex;


/* the function called for each thread */
void* worker(void* idp) {
    /* get our thread id */
    int id = * (int*) idp;

    /* we start to work */
    printf("Thread %d starting!\n", id);

    /* simulate the threads taking slightly different amounts of time by sleeping
     * for our thread id seconds */
    sleep(id);
    printf("Thread %d is done its work!\n", id);


    /***************** begin barrier here *****************/
    /* lock the condition and increment the counter */
    pthread_mutex_lock(&condition_mutex);
    counter++;

    /* if all the threads have incremented counter */
    if (counter == num_threads) {
        /* reset the counter and broadcast the event to all waiting threads */
        counter = 0;
        pthread_cond_broadcast(&condition);
    } else {
        /* otherwise not all threads are here yet, so we need to wait */
        pthread_cond_wait(&condition, &condition_mutex);
    }
    /* unlock the condition mutex */
    pthread_mutex_unlock(&condition_mutex);
    /***************** end barrier here *****************/




    printf("Thread %d is past the barrier!\n", id); 
    return NULL;
}

mutexes vs. read-write locks vs. semaphores vs. conditions

mutex
- Owned by one particular thread
- Only two values locked/unlocked
- Used to protect critical sections
read-write lock
- Similar to a mutex, but distinguishes between reading and writing
- Can be more efficient when structures are read more often than written
semaphore
- Can be shared by different threads
- Can be any unsigned value (0 is locked, > 0 is unlocked)
- Used to send a signal from one thread to another
condition variable
- Can be shared by different threads
- One thread can unlock all waiting threads (one to many communication)
- Used to signal a condition that multiple threads need to be aware of