AMDGPU: Optimize outgoing workitem ID based on reqd_work_group_size
If we know we aren't using a component from the kernel, we can save a few bit-packing instructions. We're still enabling the VGPR input to the kernel, though.
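
To illustrate the saving, here is a minimal standalone sketch, not the code from this change. It assumes the outgoing workitem ID is packed with 10 bits per component (X in bits 0-9, Y in bits 10-19, Z in bits 20-29), and that a reqd_work_group_size of 1 in a dimension means the ID in that dimension is always 0. The names `packWorkitemIds`, `PackResult`, and the `0 = unknown` convention are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical result type: the packed ID plus how many components
// actually needed a shift/or to be packed.
struct PackResult {
  std::uint32_t packed;
  unsigned componentsPacked;
};

// Hypothetical helper (not an LLVM API). Packs X/Y/Z workitem IDs into one
// 32-bit value, assuming 10 bits per component.
// reqdSize[i] == 0 means the required size is unknown for that dimension.
// reqdSize[i] == 1 means the ID in that dimension is known to be 0, so the
// packing work for that component is skipped.
inline PackResult packWorkitemIds(const std::array<std::uint32_t, 3> &ids,
                                  const std::array<std::uint32_t, 3> &reqdSize) {
  PackResult result{0u, 0u};
  for (unsigned dim = 0; dim < 3; ++dim) {
    if (reqdSize[dim] == 1)
      continue; // Component is known zero: nothing to pack.
    assert(ids[dim] < 1024u && "each component must fit in 10 bits");
    result.packed |= ids[dim] << (10u * dim);
    ++result.componentsPacked;
  }
  return result;
}
```

With a required size of, say, 64x1x1, only the X component is packed in this sketch, so the shifts and ors for Y and Z disappear.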