Commit b291182b authored Aug 25, 2023 by V Donaldson

[compiler-rt][BF16] "bfloat -> float -> bfloat" round-trip conversions

Invoking compiler-rt function __truncsfbf2 to convert a zero 32-bit float
0x00000000 to a 16-bit bfloat value currently generates the denormal value
0x0040, rather than value 0x0000. Negative zero 0x80000000 is converted
to denormal 0x8040 rather than 0x8000.

This behavior is seen in flang code under development (not yet integrated)
that converts bfloat/REAL(KIND=3) argument values to float/REAL(KIND=4)
values and then converts those values back to bfloat/REAL(KIND=3). There
are other instances of the problem. A round-trip type conversion using
__truncsfbf2 of a denormal generates a different denormal, and an sNaN
is converted to a qNaN.

The problem is addressed in generic conversion function fp_trunc_impl.inc
by removing trailing 0 significand bits when the source and destination
type formats are identical except for the significand size. This condition
is met only for float -> bfloat conversions.

Round-trip conversions for at least some other type pairs have the same
problem. A solution in those cases would need to account for exponent
size differences. Those cases are not relevant to flang compilations
and are not addressed here. A broader solution might subsume this fix,
or this fix might remain useful as is.

There are no existing tests of bfloat conversion functionality in the
compiler-rt test directory. Tests for other conversions use a common
infrastructure that does not currently have support for bfloat conversions.
This patch does not attempt to add that infrastructure for this new case.
CodeGen test bfloat.ll checks bfloat adds and other operations that invoke
__truncsfbf2.

parent 074f23e3

Show whitespace changes

Inline Side-by-side

Please to comment