Skip to content
Commit 0ee250ee authored by Stanislav Mekhanoshin's avatar Stanislav Mekhanoshin
Browse files

[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies

Codegen prepare sinks comparisons close to a user is we have only one register
for conditions. For AMDGPU we have many SGPRs capable to hold vector conditions.
Changed BE to report we have many condition registers. That way IR LICM pass
would hoist an invariant comparison out of a loop and codegen prepare will not
sink it.

With that done a condition is calculated in one block and used in another.
Current behavior is to store workitem's condition in a VGPR using v_cndmask_b32
and then restore it with yet another v_cmp instruction from that v_cndmask's
result. To mitigate the issue a propagation of source SGPR pair in place of v_cmp
is implemented. Additional side effect of this is that we may consume less VGPRs
at a cost of more SGPRs in case if holding of multiple conditions is needed, and
that is a clear win in most cases.

Differential Revision: https://reviews.llvm.org/D26114

llvm-svn: 288053
parent caaa82d9
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment