[OpenMP] Support 'omp_get_num_procs' on the device (#65501)
Summary: The `omp_get_num_procs()` function should return the amount of parallelism available. On the GPU, this was not defined. We have elected to define this function as the maximum number of wavefronts / warps that can be simultaneously resident on the device. For AMDGPU this is the number of CUs multiplied by the number of waves per CU. For NVPTX this is the maximum threads per SM divided by the warp size and multiplied by the number of SMs.
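As a rough illustration of the two formulas in the summary, the arithmetic could be sketched as follows. The helper names and parameters here are hypothetical and are not taken from the patch; the actual runtime obtains these values from device properties.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: AMDGPU value = number of CUs * waves per CU. */
static uint32_t amdgpu_num_procs(uint32_t num_cus, uint32_t waves_per_cu) {
  return num_cus * waves_per_cu;
}

/* Hypothetical helper: NVPTX value =
 * (max threads per SM / warp size) * number of SMs. */
static uint32_t nvptx_num_procs(uint32_t max_threads_per_sm,
                                uint32_t warp_size, uint32_t num_sms) {
  assert(warp_size != 0 && "warp size must be non-zero");
  return (max_threads_per_sm / warp_size) * num_sms;
}
```

With made-up inputs, `amdgpu_num_procs(60, 40)` gives 2400 and `nvptx_num_procs(2048, 32, 80)` gives 5120. These numbers are only for illustration and do not describe any particular device.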