Commit 89c8c80b authored Apr 22, 2020 by Matt Arsenault

AMDGPU: Change pre-gfx9 implementation of fcanonicalize to mul

If f32 denormals were enabled pre-gfx9, we would still try to
implement this with v_max_f32. Pre-gfx9, these instructions ignored
the denormal mode and did not flush. Switch to the multiply form for
f32 as a workaround, which should work in any denormal mode.
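
The difference between the two lowerings can be sketched with a small
software model. This is an illustration only, not the backend code or an
exact description of the hardware: the function names are hypothetical,
and the model simply assumes that the multiply honors the denormal mode
while the pre-gfx9 max passes its input through unchanged.

```python
import math
import struct


def f32_bits(x: float) -> int:
    """Return the IEEE-754 binary32 bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def is_denormal_f32(x: float) -> bool:
    """True if x is a nonzero f32 value with a zero exponent field."""
    bits = f32_bits(x)
    return (bits & 0x7F800000) == 0 and (bits & 0x007FFFFF) != 0


def canonicalize_via_mul(x: float, flush_denormals: bool) -> float:
    """Hypothetical model of the multiply form (x * 1.0).

    Assumption: the multiply respects the current denormal mode, so a
    denormal becomes a signed zero when flushing is on.
    """
    result = x * 1.0
    if flush_denormals and is_denormal_f32(result):
        return math.copysign(0.0, result)
    return result


def canonicalize_via_max_pre_gfx9(x: float, flush_denormals: bool) -> float:
    """Hypothetical model of the max form (max(x, x)) before gfx9.

    Assumption: the instruction ignores the denormal mode, so the
    flush_denormals argument has no effect on the result.
    """
    return max(x, x)


# Smallest positive f32 denormal (bit pattern 0x00000001).
tiny = struct.unpack("<f", struct.pack("<I", 1))[0]
```

In this model the two forms agree on normal values and whenever flushing
is off; they differ only for a denormal input with flushing on, which is
the case the multiply form is meant to cover.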

This fixes conformance failures when the library implementations of
fmin/fmax were accidentally not inlined, forcing the assumption of no
flushing on targets where denormals are not enabled by default. This
is a workaround: really we should not be mixing code with different
FP mode expectations, but we prefer the lowering that will work in
any mode.

Now this will always use max to implement canonicalize on gfx9+. This
is only really beneficial for f64. For f32/f16 it's a neutral choice
(and worse in terms of code size in 1 case), but possibly worse for
the compiler since it does add an extra register use operand. Leave
this change for later.

parent d987eed9
