Reduction

reduction

Definition

The reduction clause indicates that the listed variables are, as its name suggests, used in a reduction. Each implicit task or SIMD lane creates a private copy of each list item and initialises it to the initialiser value of the reduction identifier, for example 0 for a sum or 1 for a product. After the end of the region, the original list item is updated with the values of the private copies using the combiner associated with the reduction identifier.
By default, the reduction computation is complete at the end of the construct. However, if nowait is specified on the construct, this is no longer guaranteed. Indeed, accesses to the original list item will create a data race and, thus, have unspecified effect unless synchronisation ensures that they occur after all threads have executed all of their iterations or section constructs, and the reduction computation has completed and stored the computed value of that list item. This can most simply be ensured through a barrier synchronisation.

reduction([modifier, ]identifier: list)

Parameters

modifier [Optional]

  • default: this is equivalent to passing no modifier. For parallel and worksharing constructs, one or more private copies of each list item are created for each implicit task, as if the private clause had been used. For the simd construct, one or more private copies of each list item are created for each SIMD lane, as if the private clause had been used. For the taskloop construct, private copies are created according to the rules of the reduction scoping clauses. For the teams construct, one or more private copies of each list item are created for the initial task of each team in the league, as if the private clause had been used. For the loop construct, private copies are created and used in the construct according to the description and restrictions in Section 2.19.3 on page 279. At the end of a region that corresponds to a construct for which the reduction clause was specified, the original list item is updated by combining its original value with the final value of each of the private copies, using the combiner of the specified reduction identifier.
  • inscan: a scan computation is performed over updates to the list item performed in each logical iteration of the loop associated with the worksharing-loop, worksharing-loop SIMD, or simd construct. The list items are privatised in the construct according to the description and restrictions in Section 2.19.3 on page 279. At the end of the region, each original list item is assigned the value of the private copy from the last logical iteration of the loops associated with the construct.
  • task: in a parallel or worksharing construct, each list item is privatised according to the description and restrictions in Section 2.19.3 on page 279, and an unspecified number of additional private copies are created to support task reductions. Any copies associated with the reduction are initialised before they are accessed by the tasks that participate in the reduction, which include all implicit tasks in the corresponding region and all participating explicit tasks that specify an in_reduction clause. After the end of the region, the original list item contains the result of the reduction.

identifier

The identifier indicating the reduction operation to apply. It is either a user-defined operator or one of the following operators:

  • +: sum.
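  • -: difference. The private copies are initialised to 0 and the partial results are combined by addition, as for +.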
  • *: product.
  • &: bit-wise and.
  • |: bit-wise or.
  • ^: bit-wise exclusive or.
  • &&: logical and.
  • ||: logical or.
  • min: minimum.
  • max: maximum.

list

The list of reduction variables, separated with commas.

Example

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

/**
 * @brief Illustrates how to do a classic reduction.
 * @details This example consists in calculating the sum of all elements of an
 * array.
 **/
int main(int argc, char* argv[])
{
    // Use 2 threads when creating OpenMP parallel regions
    omp_set_num_threads(2);

    int total = 0;
    const int ARRAY_SIZE = 10;
    int* myArray = malloc(sizeof(int) * ARRAY_SIZE);
    if(myArray == NULL)
    {
        printf("Cannot allocate the array \"myArray\".\n");
        return EXIT_FAILURE;
    }

    // Initialise the array
    for(int i = 0; i < ARRAY_SIZE; i++)
    {
        myArray[i] = i;
    }

    // Calculate the sum of all elements
    #pragma omp parallel for default(none) shared(myArray) firstprivate(ARRAY_SIZE) reduction(+: total)
    for(int i = 0; i < ARRAY_SIZE; i++)
    {
        total += myArray[i];
    }

    printf("The sum of all array elements is equal to %d.\n", total);
    free(myArray);

    return EXIT_SUCCESS;
}