AMDGPU: Fix computation for getOccupancyWithLocalMemSize
The computation here didn't really make sense to me, and reported wildly different results depending on the flat work group size attribute. I think this should really report a range derived from the possible work group size bounds, and only allow an occupancy that is a multiple of the group size.
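
To illustrate the intent, here is a rough sketch of what "a range derived from the work group size bounds, in multiples of the group size" could look like. It is not the code in this patch: the function names, the parameters, and the simplified single-execution-unit model are all assumptions made for illustration, and the range is only evaluated at the two bounds of the flat work group size.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical helper: number of waves needed to hold one work group.
unsigned wavesPerGroup(unsigned GroupSize, unsigned WaveSize) {
  assert(WaveSize > 0 && "wave size must be nonzero");
  return (GroupSize + WaveSize - 1) / WaveSize;
}

// Hypothetical helper: occupancy in waves for one fixed group size.
// The result is always a whole number of groups, so it is a multiple of
// the waves per group. Returns 0 if not even one group fits.
unsigned occupancyForGroupSize(unsigned LDSTotal, unsigned LDSPerGroup,
                               unsigned GroupSize, unsigned WaveSize,
                               unsigned MaxWaves) {
  unsigned WPG = wavesPerGroup(GroupSize, WaveSize);
  // Groups that fit by wave count alone.
  unsigned GroupsByWaves = MaxWaves / WPG;
  // Groups that fit by LDS usage; no LDS use means no LDS limit.
  unsigned GroupsByLDS =
      LDSPerGroup ? LDSTotal / LDSPerGroup : GroupsByWaves;
  return std::min(GroupsByLDS, GroupsByWaves) * WPG;
}

// Hypothetical helper: {min, max} occupancy taken from the two bounds of
// the flat work group size attribute.
std::pair<unsigned, unsigned>
occupancyRange(unsigned LDSTotal, unsigned LDSPerGroup, unsigned MinGroupSize,
               unsigned MaxGroupSize, unsigned WaveSize, unsigned MaxWaves) {
  unsigned A = occupancyForGroupSize(LDSTotal, LDSPerGroup, MinGroupSize,
                                     WaveSize, MaxWaves);
  unsigned B = occupancyForGroupSize(LDSTotal, LDSPerGroup, MaxGroupSize,
                                     WaveSize, MaxWaves);
  return {std::min(A, B), std::max(A, B)};
}
```

With made-up numbers (64 KiB of LDS, 16 KiB used per group, wave size 64, at most 10 waves), a group size of 64 is limited by LDS and a group size of 256 is limited by the wave count, so the two bounds give different occupancies and the function reports both ends instead of a single value.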