Commit 43531e71 authored by Durgadoss R, committed by GitHub

[LLVM][NVPTX] Add cp.async.bulk.commit/wait intrinsics (#78698)

This patch adds NVVM intrinsics and NVPTX codegen for the bulk variants
of the async-copy commit/wait instructions (cp.async.bulk.commit_group
and cp.async.bulk.wait_group).
Lit tests are added to verify the generated PTX.
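
For readers unfamiliar with the underlying instructions, the following is a minimal CUDA sketch of the PTX that these intrinsics are meant to lower to, written with inline assembly rather than the new intrinsics themselves. The wrapper names (bulk_commit_group, bulk_wait_group, bulk_wait_group_read) and the demo kernel are hypothetical and not part of this patch; the sketch assumes an sm_90 target and a toolchain that supports PTX ISA 8.0.

```cuda
// Hypothetical helper wrappers around the PTX bulk async-group instructions.
// These names are illustrative only; they are not defined by this patch.

// Commit all prior uncommitted cp.async.bulk operations of this thread
// into a new bulk async-group.
__device__ inline void bulk_commit_group() {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 900
  asm volatile("cp.async.bulk.commit_group;" ::: "memory");
#endif
}

// Wait until at most N of the most recent bulk async-groups are pending.
// N must be a compile-time constant because PTX takes it as an immediate.
template <int N>
__device__ inline void bulk_wait_group() {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 900
  asm volatile("cp.async.bulk.wait_group %0;" ::"n"(N) : "memory");
#endif
}

// Same as above, but only waits until the source reads of the pending
// groups have completed (the .read variant).
template <int N>
__device__ inline void bulk_wait_group_read() {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 900
  asm volatile("cp.async.bulk.wait_group.read %0;" ::"n"(N) : "memory");
#endif
}

// Illustrative kernel: a real kernel would issue cp.async.bulk copies
// before the commit; here the group is empty, which PTX permits.
__global__ void demo_kernel() {
  bulk_commit_group();   // close the current (empty) bulk async-group
  bulk_wait_group<0>();  // wait for all committed groups to complete
}
```

The wait count is a template parameter because the PTX operand is an immediate, so it cannot be a runtime value.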

PTX Doc link:

https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cp-async-bulk-commit-group



Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
parent 42b16035