Skip to content
Commit 06e08f0b authored by Gheorghe-Teodor Bercea's avatar Gheorghe-Teodor Bercea
Browse files

[OpenMP][libomptarget] New reduction scheme for team reductions

Summary:
This patch adds a more sophisticated team reduction scheme to the OpenMP libomptarget-nvptx runtime.

The scheme uses a fixed size global memory buffer whose length can be adjusted via compiler flag:
```
-fopenmp-cuda-teams-reduction-recs-num=1024
```
The global buffer is a structure of arrays (with default size of 1024 each and controlled by the above flag), one array for each reduction variable.

Values in the buffer are processed by the last team to finish executing the body of the target region.

In addition to adding support for the new flag, the compiler also emits special functions used for the reduction of the intermediate reduction values. These changes will be added in a separate compiler patch following this one.




Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, jfb, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D58409

llvm-svn: 354471
parent d882ad5e
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment