MPI_Reduce_scatter

Definition

MPI_Reduce_scatter is the means by which MPI processes can apply a reduction followed by a variable scatter, which may trigger extra optimisations compared to manually issuing an MPI_Reduce followed by an MPI_Scatterv. Unlike MPI_Reduce_scatter_block, MPI_Reduce_scatter allows blocks to be of different lengths. MPI_Reduce_scatter is a collective operation; it must be called by every MPI process in the communicator given. Predefined operations are: MPI_MIN, MPI_MAX, MPI_BOR, MPI_BXOR, MPI_LOR, MPI_LXOR, MPI_BAND, MPI_LAND, MPI_SUM and MPI_PROD. Other variants of MPI_Reduce_scatter are MPI_Reduce_scatter_block, MPI_Ireduce_scatter_block and MPI_Ireduce_scatter. Refer to MPI_Ireduce_scatter to see the non-blocking counterpart of MPI_Reduce_scatter.
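
For illustration only, the combined operation is conceptually close to the sketch below, assuming a sum reduction on MPI_INT elements. The names comm, size, my_rank, counts, displacements, total_count, send_buffer and receive_buffer are assumptions standing for the communicator, its size, the calling process's rank, the per-block counts, their prefix sums, their total, and the two buffers; an actual MPI implementation may fuse and optimise these two steps internally.

// Conceptual sketch only, not how MPI implementations actually proceed.
// "displacements[i]" is assumed to hold counts[0] + ... + counts[i - 1],
// and "total_count" the sum of all counts.
int* temporary = (my_rank == 0) ? malloc(total_count * sizeof(int)) : NULL;

// Step 1: reduce all total_count elements onto a temporary buffer on rank 0.
MPI_Reduce(send_buffer, temporary, total_count, MPI_INT, MPI_SUM, 0, comm);

// Step 2: scatter block i, of length counts[i], to MPI process i.
MPI_Scatterv(temporary, counts, displacements, MPI_INT,
             receive_buffer, counts[my_rank], MPI_INT, 0, comm);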

int MPI_Reduce_scatter(const void* send_buffer,
                       void* receive_buffer,
                       const int* counts,
                       MPI_Datatype datatype,
                       MPI_Op operation,
                       MPI_Comm communicator);

Parameters

send_buffer
A pointer to the buffer to send for reduction.
receive_buffer
A pointer to the buffer in which to store the result of the reduction.
counts
The array containing the number of elements per block; the i-th entry gives the number of elements that MPI process i receives into its receive buffer. This array must be identical on every MPI process, and the send buffer on each process must hold as many elements as the sum of all counts (see the sketch after this list).
datatype
The type of a buffer element.
operation
The operation to apply to combine the values in the reduction. This operation must be associative; predefined operations are also commutative, whereas user-defined operations may be non-commutative.
communicator
The communicator in which the reduction takes place.
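
For illustration, the sketch below shows how the counts array typically drives buffer sizes; the names size and my_rank, and the malloc-based allocations, are assumptions made for the sketch rather than part of the routine.

// The same counts array must be passed by every MPI process: the send
// buffer holds the sum of all counts, and MPI process i receives
// counts[i] elements into its receive buffer.
int total_count = 0;
for (int i = 0; i < size; i++)
{
    total_count += counts[i];
}
int* send_buffer = malloc(total_count * sizeof(int));
int* receive_buffer = malloc(counts[my_rank] * sizeof(int));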

Returned value

The error code returned from the reduction.

MPI_SUCCESS
The routine successfully completed.
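
For illustration, a minimal sketch of checking this return value, reusing the parameter names from the signature above as assumptions; note that by default MPI aborts on error, so such a check is only reached if the communicator's error handler has been changed, for instance to MPI_ERRORS_RETURN.

// Check the code returned by the reduce-scatter and abort on failure.
int error = MPI_Reduce_scatter(send_buffer, receive_buffer, counts,
                               MPI_INT, MPI_SUM, MPI_COMM_WORLD);
if (error != MPI_SUCCESS)
{
    fprintf(stderr, "MPI_Reduce_scatter failed with error code %d.\n", error);
    MPI_Abort(MPI_COMM_WORLD, error);
}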

Example

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

/**
 * @brief Illustrates how to use a reduce scatter.
 * @details This application is meant to be run with 3 MPI processes. It
 * consists of a sum reduction; every MPI process has four values to send for
 * reduction. The first values from the MPI processes will be reduced and
 * stored on MPI process 0. The second and third values will be reduced
 * separately and stored on MPI process 1, and similarly for the fourth values
 * on MPI process 2. It can be visualised as follows:
 *
 *      +---------------+  +---------------+  +---------------+
 *      |   Process 0   |  |   Process 1   |  |   Process 2   |
 *      +---------------+  +---------------+  +---------------+
 *      |     Values    |  |     Values    |  |     Values    |
 *      +---+---+---+---+  +---+---+---+---+  +---+---+---+---+
 *      | 0 | 1 | 2 | 3 |  | 4 | 5 | 6 | 7 |  | 8 | 9 | 10| 11|
 *      +---+---+---+---+  +---+---+---+---+  +---+---+---+---+
 *        |    \   \   \     /  |     |  \      /   /   /   |
 *        | ____\___\___\___/___|_____|___\____/   /   /    |
 *        |/     \   \   \      |     |    \      /   /     |
 *        |       \___\___\____ | ____|_____\____/   /      |
 *        |            \   \   \|/    |      \      /       |
 *        |             \___\___|____ | ______\____/        |
 *        |                  \  |    \|/       \            |
 *        |                   \_|_____|_________\__________ |
 *        |                     |     |                    \|
 *        |                     |     |                     |
 *     +--+--+                +-+---+-+---+              +--+--+
 *     | SUM |                | SUM | SUM |              | SUM |
 *     +-----+                +-----+-----+              +-----+
 *     |  12 |                |  15 |  18 |              |  21 |
 *     +--+--+                +--+--+--+--+              +--+--+
 *        |                      |     |                    |      
 *  +-----+-----+             +--+-----+--+           +-----+-----+
 *  | Process 0 |             | Process 1 |           | Process 2 |
 *  +-----------+             +-----------+           +-----------+
 *  |   Value   |             |   Values  |           |   Value   |
 *  +-----------+             +-----+-----+           +-----------+
 *  |     12    |             |  15 |  18 |           |     21    |
 *  +-----------+             +-----+-----+           +-----------+
 **/
int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    // Get the size of the communicator
    int size = 0;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if(size != 3)
    {
        printf("This application is meant to be run with 3 MPI processes.\n");
        MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
    }

    // Get my rank
    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    // Define my values
    int values[4] = {4 * my_rank, 4 * my_rank + 1, 4 * my_rank + 2, 4 * my_rank + 3};

    // Define the block lengths
    int counts[3] = {1, 2, 1};

    if(my_rank == 1)
    {
        // Send my values for reduction and provide the buffer in which to receive the two reduction results assigned to this MPI process
        int reduction_results[2];
        MPI_Reduce_scatter(values, reduction_results, counts, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
        printf("[MPI process %d] The sum I received are %d and %d.\n", my_rank, reduction_results[0], reduction_results[1]);
    }
    else
    {
        // Send my values for reduction and provide the buffer in which to receive the single reduction result assigned to this MPI process
        int reduction_result;
        MPI_Reduce_scatter(values, &reduction_result, counts, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
        printf("[MPI process %d] The sum I received is %d.\n", my_rank, reduction_result);
    }

    MPI_Finalize();

    return EXIT_SUCCESS;
}