Introduction to MPI

Multi-Process Programming

Both Pthreads and OpenMP are for writing multi-threaded programs.

MPI (Message Passing Interface) is for writing multi-process ones. This is quite a different model:

- Each process has its own address space, so variables are not implicitly shared.
- Data must be shared explicitly, via messages.
- The processes can run on the same machine, or be distributed across several machines.

Hello World

Hello world in MPI might look like this:


#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv) {
    int rank, size;

    /* initialize MPI */
    MPI_Init(&argc, &argv);

    /* get the rank (process id) and size (number of processes) */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* print a hello message */
    printf("Hello from process %d of %d!\n", rank, size);

    /* quit MPI */
    MPI_Finalize();
    return 0;
}

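You might notice that, unlike Pthreads or OpenMP, there is no code to create any threads/processes. That is because the processes are created outside the program, when it is launched, and each process runs the entire program from the start of main. The processes tell themselves apart only by their rank.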

Compiling and Running

To compile MPI programs, the plain gcc command is not enough on its own, because it does not know where the MPI headers and libraries are. Use the mpicc command instead:


$ mpicc hello.c

This command is a wrapper around the underlying C compiler (often gcc) that adds the necessary compiler and linker flags.

We can then run our program in the standard way:


$ ./a.out

However, that runs it as only one process!

In order to launch multiple processes, we use the mpirun command:


$ mpirun -np 8 ./a.out

This tells MPI to launch 8 processes running our program, and give them each a rank.

Disabling Warnings

If you run an MPI program under Open MPI and get warnings about missing network interfaces such as:

--------------------------------------------------------------------------
[[35292,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:

Module: OpenFabrics (openib)
  Host: cs

Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------

Then add the following to your .bash_profile:


export OMPI_MCA_btl_base_warn_component_unused=0

MPI and C++

We can also use MPI with C++ programs.


#include <iostream>
#include <mpi.h>

int main(int argc, char** argv) {
    int rank, size;

    /* initialize MPI */
    MPI_Init(&argc, &argv);

    /* get the rank (process id) and size (number of processes) */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* print a hello message */
    std::cout << "Hello from process " << rank <<  " of " << size << "!" << std::endl;

    /* quit MPI */
    MPI_Finalize();
    return 0;
}

The procedure for compiling and running is nearly the same, except we use "mpic++" instead of "mpicc":


$ mpic++ hello.cpp
$ mpirun -np 8 ./a.out

The Message Passing Paradigm

Processes communicate by passing messages amongst themselves. Messages are simply blocks of data.

Messages may be passed "point-to-point", where one process sends a message and another receives it.

Messages may also be broadcast from one process to the rest.

Example: Parallel Sum

How could we write an algorithm to sum a range of numbers, in parallel, using a message passing model?
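
One way to reason about this question before involving MPI is to model it serially. The sketch below is plain C with no MPI calls: a loop stands in for the processes, each "rank" sums its own slice of the inclusive range, and the running total plays the role of the process that collects the partial sums. The function name simulated_parallel_sum and its parameters are hypothetical, chosen only for illustration.

```c
#include <assert.h>

/* Serial model of a message-passing sum over the inclusive range [lo, hi].
 * Hypothetical helper for illustration; it makes no MPI calls.
 * Each loop iteration stands in for one process ("rank"). */
long simulated_parallel_sum(int lo, int hi, int size) {
    assert(size > 0 && lo <= hi);

    int chunk = (hi - lo) / size;   /* numbers per rank, rounded down */
    long total = 0;                 /* what the collecting process ends up with */

    for (int rank = 0; rank < size; rank++) {
        int start = lo + chunk * rank;
        /* the last rank also takes whatever is left over */
        int end = (rank == size - 1) ? hi : start + chunk - 1;

        long partial = 0;           /* computed independently by this rank */
        for (int i = start; i <= end; i++) {
            partial += i;
        }

        total += partial;           /* stands in for "send partial, receive it" */
    }
    return total;
}
```

For example, simulated_parallel_sum(0, 100, 8) returns 5050, the same as summing 0 through 100 directly. In a real message-passing program the body of the outer loop runs in separate processes, and the line that adds to the total becomes a send on one side and a receive on the other.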

Using MPI Send & Receive

The following two functions can be used for point-to-point communication in MPI:


int MPI_Send(const void* buffer, int count, MPI_Datatype type, int destination, int tag, MPI_Comm comm);

int MPI_Recv(void* buffer, int count, MPI_Datatype type, int source, int tag, MPI_Comm comm, MPI_Status* status);

These functions have several parameters:

buffer
In a send call, a pointer to the data to be sent. In a receive call, a pointer to the place where the received data should be written.
count
In a send call, the number of data elements to be sent. In a receive call, the maximum number of elements the buffer can hold.
type
The type of each data element. Along with the count, this tells MPI how many bytes will be sent/received. Below is a partial list:
- MPI_CHAR
- MPI_SHORT
- MPI_INT
- MPI_LONG
- MPI_FLOAT
- MPI_DOUBLE
- MPI_BYTE
destination/source
The rank of the process we want to send data to / receive data from.
tag
An arbitrary integer that identifies what we are communicating. MPI uses this, together with the source and the communicator, to match sends with receives.
comm
The communicator, i.e. the group of processes taking part in the operation. This is normally MPI_COMM_WORLD.
status
In a receive operation, this records the source and tag of the message that actually arrived, and lets us find out how many elements were actually received (via MPI_Get_count).
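
How the source, tag, and comm parameters decide which message a receive picks up can be pictured with a small model. This is a sketch in plain C, not MPI's implementation: the struct and function names (envelope, recv_matches) and the wildcard constants ANY_SOURCE and ANY_TAG are hypothetical stand-ins. Real MPI provides MPI_ANY_SOURCE and MPI_ANY_TAG for the same purpose. Communicators are modeled here as plain integers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for MPI_ANY_SOURCE and MPI_ANY_TAG */
#define ANY_SOURCE (-1)
#define ANY_TAG    (-1)

/* The "envelope" that travels with a message (illustrative model only) */
struct envelope {
    int source;  /* rank of the sender */
    int tag;     /* user-chosen label */
    int comm;    /* communicator, modeled here as an integer id */
};

/* Would a receive posted with (source, tag, comm) accept this message? */
bool recv_matches(int source, int tag, int comm, struct envelope msg) {
    /* messages never cross from one communicator to another */
    if (comm != msg.comm) return false;
    /* the source must agree unless the receiver used the wildcard */
    if (source != ANY_SOURCE && source != msg.source) return false;
    /* the tag must agree unless the receiver used the wildcard */
    if (tag != ANY_TAG && tag != msg.tag) return false;
    return true;
}
```

In this model a message sent by rank 3 with tag 7 is accepted by a receive that names source 3 and tag 7, or by one that uses a wildcard for either field, but not by a receive that names tag 8. The count and type play no part in the match; they only describe the data once a match has been made.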

Example

Below is an MPI implementation of the sum program. It uses the simple method of all processes passing their partial sums to the first process.


#include <stdlib.h>
#include <mpi.h>
#include <stdio.h>

#define START 0
#define END 100

int main(int argc, char** argv) {
    int rank, size;

    /* initialize MPI */
    MPI_Init(&argc, &argv);

    /* get the rank (process id) and size (number of processes) */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* calculate the start and end points by evenly dividing the range */
    int chunk = (END - START) / size;
    int start = START + chunk * rank;
    int end = start + chunk - 1;

    /* the last process also takes any numbers left over, up to and including END */
    if (rank == (size - 1)) {
        end = END;
    }

    /* do the calculation */
    int sum = 0, i;
    for (i = start; i <= end; i++) {
        sum += i;
    }

    /* debugging output */
    printf("Process %d: sum(%d, %d) = %d\n", rank, start, end, sum);

    /* MPI communication: process 0 receives everyone else's sum */
    if (rank == 0) {
        /* parent process: receive each partial sum and add it to ours */
        int partial;
        MPI_Status status;
        for (i = 1; i < size; i++) {
            MPI_Recv(&partial, 1, MPI_INT, i, 0, MPI_COMM_WORLD, &status);
            sum += partial;
        }
    } else {
        /* worker process: send sum to process 0 */
        MPI_Send(&sum, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    /* now have process 0 display the results */
    if (rank == 0) {
        printf("The final sum = %d.\n", sum);
    }

    /* quit MPI */
    MPI_Finalize();
    return 0;
}
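
In the program above, the last process picks up every leftover number, so it can end up with noticeably more work than the others. One possible alternative, sketched below in plain C, spreads the leftovers so that block sizes differ by at most one. The helper name block_range and its parameters are hypothetical and not part of MPI; the range is treated as inclusive at both ends, as in the program above.

```c
#include <assert.h>

/* Split the inclusive range [lo, hi] into `size` contiguous blocks whose
 * lengths differ by at most one, and report the block owned by `rank`.
 * Hypothetical helper; an empty block is reported as *end == *start - 1. */
void block_range(int lo, int hi, int size, int rank, int *start, int *end) {
    assert(size > 0 && rank >= 0 && rank < size && lo <= hi);

    int n = hi - lo + 1;        /* total count of numbers */
    int base = n / size;        /* every rank gets at least this many */
    int extra = n % size;       /* the first `extra` ranks get one more */

    /* ranks before this one hold rank * base numbers, plus one extra
     * for each of them that is among the first `extra` ranks */
    int offset = rank * base + (rank < extra ? rank : extra);
    int length = base + (rank < extra ? 1 : 0);

    *start = lo + offset;
    *end = *start + length - 1;
}
```

With this helper, the start and end calculation and the special case for the last process could be replaced by a single call such as block_range(START, END, size, rank, &start, &end). With 8 processes and the range 0 to 100, the largest block shrinks from 17 numbers to 13.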