Commit dfab31b4, authored by Jakub Chlanda, committed by GitHub

[NVPTX] Add support for maxclusterrank in launch_bounds (#66496)

Since SM_90, CUDA supports an additional argument to the launch_bounds
attribute, maxBlocksPerCluster, which expresses the maximum number of
CTAs that can be part of a cluster. See:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-dimension-directives-maxclusterrank
and
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#launch-bounds
for details.
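
For illustration, here is a minimal sketch of a kernel that uses the three-argument form of the attribute. The kernel name `scale`, the bound values, and the host code are assumptions made for this example, not part of the patch. The third argument is only meaningful when targeting SM_90 or newer, and it is expected to lower to the PTX `.maxclusterrank` directive described in the first link above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative bounds (assumed values, not taken from the patch):
//   128 = maximum threads per block
//   2   = minimum blocks per multiprocessor
//   4   = maximum blocks (CTAs) per cluster, the new third argument
__global__ void __launch_bounds__(128, 2, 4)
scale(float *data, float factor, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n)
    data[i] *= factor;
}

int main() {
  const int n = 1024;
  float *data = nullptr;

  // Unified memory keeps the host side of the sketch short.
  if (cudaMallocManaged(&data, n * sizeof(float)) != cudaSuccess) {
    std::fprintf(stderr, "allocation failed\n");
    return 1;
  }
  for (int i = 0; i < n; ++i)
    data[i] = 1.0f;

  // The block size must not exceed the first launch bound (128).
  scale<<<(n + 127) / 128, 128>>>(data, 2.0f, n);

  cudaError_t err = cudaDeviceSynchronize();
  if (err != cudaSuccess) {
    std::fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
    cudaFree(data);
    return 1;
  }

  std::printf("data[0] = %f\n", data[0]);
  cudaFree(data);
  return 0;
}
```

Building this needs a toolchain and target that understand the third argument, for example a CUDA 12 or newer compiler with an SM_90 architecture flag. On older targets the cluster bound does not apply.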
parent d1653c8e