Rookie HPC


MPI_Iallreduce

Definition

MPI_Iallreduce is the non-blocking version of MPI_Allreduce: it applies a reduction across the buffers contributed by all MPI processes in a communicator and makes the result available to every one of them. Unlike MPI_Allreduce however, MPI_Iallreduce returns immediately, before the reduction is guaranteed to be complete. The user must therefore explicitly wait (MPI_Wait) or test (MPI_Test) for the completion of MPI_Iallreduce before safely reusing the buffers passed. Also, MPI_Iallreduce is a collective operation; it must be called by every MPI process in the communicator given. The predefined operations are MPI_MIN, MPI_MAX, MPI_SUM, MPI_PROD, MPI_LAND, MPI_LOR, MPI_LXOR, MPI_BAND, MPI_BOR, MPI_BXOR, MPI_MINLOC and MPI_MAXLOC. Other variants of MPI_Iallreduce are MPI_Allreduce, MPI_Reduce and MPI_Ireduce. Refer to MPI_Allreduce to see the blocking counterpart of MPI_Iallreduce.


int MPI_Iallreduce(const void* send_buffer,
                   void* receive_buffer,
                   int count,
                   MPI_Datatype datatype,
                   MPI_Op operation,
                   MPI_Comm communicator,
                   MPI_Request* request);

Parameters

send_buffer
A pointer to the buffer to send for reduction.
receive_buffer
A pointer to the buffer in which to store the result of the reduction.
count
The number of elements in the send buffer, which is also the number of elements in the receive buffer.
datatype
The type of a buffer element.
operation
The operation to apply to combine the values contributed to the reduction. Predefined operations are both associative and commutative; user-defined operations must be associative but may be non-commutative.
communicator
The communicator in which the reduction takes place.
request
The variable in which to store the handle on the non-blocking operation.

Returned value

The error code returned from the non-blocking reduction.

MPI_SUCCESS
The routine successfully completed.

Example


#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

/**
 * @brief Illustrates how to use a non-blocking all-reduce.
 * @details This application consists of a sum all-reduction; every MPI process
 * sends its rank for reduction before the sum of these ranks is stored in the
 * receive buffer of each MPI process. It can be visualised as follows:
 *
 * +-----------+ +-----------+ +-----------+ +-----------+
 * | Process 0 | | Process 1 | | Process 2 | | Process 3 |
 * +-+-------+-+ +-+-------+-+ +-+-------+-+ +-+-------+-+
 *   | Value |     | Value |     | Value |     | Value |
 *   |   0   |     |   1   |     |   2   |     |   3   |
 *   +-------+     +----+--+     +--+----+     +-------+
 *            \         |           |         /
 *             \        |           |        /
 *              \       |           |       /
 *               \      |           |      /
 *                +-----+-----+-----+-----+
 *                            |
 *                        +---+---+
 *                        |  SUM  |
 *                        +---+---+
 *                        |   6   |
 *                        +-------+
 *                            |
 *                +-----+-----+-----+-----+
 *               /      |           |      \
 *              /       |           |       \
 *             /        |           |        \
 *            /         |           |         \
 *   +-------+     +----+--+     +--+----+     +-------+  
 *   |   6   |     |   6   |     |   6   |     |   6   |  
 * +-+-------+-+ +-+-------+-+ +-+-------+-+ +-+-------+-+
 * | Process 0 | | Process 1 | | Process 2 | | Process 3 |
 * +-----------+ +-----------+ +-----------+ +-----------+
 **/
int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    // Get the size of the communicator
    int size = 0;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if(size != 4)
    {
        printf("This application is meant to be run with 4 MPI processes.\n");
        MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
    }

    // Get my rank
    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    // Each MPI process sends its rank for reduction; every MPI process receives the result
    int reduction_result = 0;
    MPI_Request request;
    MPI_Iallreduce(&my_rank, &reduction_result, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD, &request);

    // Do some other job
    printf("Process %d issued the MPI_Iallreduce and has moved on, printing this message.\n", my_rank);

    // Wait for the MPI_Iallreduce to complete
    MPI_Wait(&request, MPI_STATUS_IGNORE);

    printf("[MPI Process %d] The sum of all ranks is %d.\n", my_rank, reduction_result);

    MPI_Finalize();

    return EXIT_SUCCESS;
}