Skip to content
Commit f6c8a8e9 authored by Pravin Jagtap's avatar Pravin Jagtap
Browse files

[AMDGPU] Iterative scan implementation for atomic optimizer.

This patch provides an alternative implementation to DPP for Scan Computations.

An alternative implementation iterates over all active lanes of Wavefront
using llvm.cttz and performs the following steps:
    1.  Read the value that needs to be atomically incremented using
        llvm.amdgcn.readlane intrinsic
    2.  Accumulate the result.
    3.  Update the scan result using llvm.amdgcn.writelane intrinsic
        if intermediate scan results are needed later in the kernel.

Reviewed By: arsenm, cdevadas

Differential Revision: https://reviews.llvm.org/D147408
parent a90e3509
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment