[compiler-rt][BF16] "bfloat -> float -> bfloat" round-trip conversions
Invoking compiler-rt function __truncsfbf2 to convert a zero 32-bit float 0x00000000 to a 16-bit bfloat value currently generates the denormal value 0x0040, rather than value 0x0000. Negative zero 0x80000000 is converted to denormal 0x8040 rather than 0x8000.

This behavior is seen in flang code under development (not yet integrated) that converts bfloat/REAL(KIND=3) argument values to float/REAL(KIND=4) values and then converts those values back to bfloat/REAL(KIND=3).

There are other instances of the problem. A round-trip type conversion using __truncsfbf2 of a denormal generates a different denormal, and an sNaN is converted to a qNaN.

The problem is addressed in generic conversion function fp_trunc_impl.inc by removing trailing 0 significand bits when the source and destination type formats are identical except for the significand size. This condition is met only for float -> bfloat conversions.

Round-trip conversions for at least some other type pairs have the same problem. A solution in those cases would need to account for exponent size differences. Those cases are not relevant to flang compilations and are not addressed here. A broader solution might subsume this fix, or this fix might remain useful as is.

There are no existing tests of bfloat conversion functionality in the compiler-rt test directory. Tests for other conversions use a common infrastructure that does not currently have support for bfloat conversions. This patch does not attempt to add that infrastructure for this new case. CodeGen test bfloat.ll checks bfloat adds and other operations that invoke __truncsfbf2.
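The round-trip property described above can be illustrated with a small standalone sketch. This is not the code in fp_trunc_impl.inc and not the patch itself; the helper names (bf16_to_f32_bits, f32_to_bf16_bits, count_round_trip_mismatches) are hypothetical and work on raw bit patterns. It relies only on the fact that bfloat and float share the same sign and exponent layout, so a float whose low 16 significand bits are all zero can be narrowed by dropping those bits, with no rounding, denormal, or NaN adjustment.

```c
#include <stdint.h>

/* Hypothetical helper: widen a bfloat bit pattern to a float bit pattern.
   The conversion is exact: the low 16 significand bits become zero. */
static uint32_t bf16_to_f32_bits(uint16_t b) {
  return (uint32_t)b << 16;
}

/* Hypothetical helper: narrow a float bit pattern to a bfloat bit pattern.
   Illustrative only; assumes round-to-nearest-even for inexact values. */
static uint16_t f32_to_bf16_bits(uint32_t f) {
  uint16_t hi = (uint16_t)(f >> 16);
  uint32_t low = f & 0xFFFFu;

  /* All dropped bits are zero: the value is exactly representable, so
     zeros, denormals, infinities, and NaNs (including sNaN) pass through
     unchanged. */
  if (low == 0)
    return hi;

  /* NaN whose payload does not fit: keep it a NaN by setting the quiet bit. */
  if ((f & 0x7F800000u) == 0x7F800000u && (f & 0x007FFFFFu) != 0)
    return (uint16_t)(hi | 0x0040u);

  /* Round to nearest, ties to even. */
  uint32_t rounding = 0x7FFFu + (hi & 1u);
  return (uint16_t)((f + rounding) >> 16);
}

/* Counts bfloat bit patterns that do not survive
   bfloat -> float -> bfloat unchanged. */
static int count_round_trip_mismatches(void) {
  int mismatches = 0;
  for (uint32_t b = 0; b <= 0xFFFFu; ++b) {
    if (f32_to_bf16_bits(bf16_to_f32_bits((uint16_t)b)) != (uint16_t)b)
      ++mismatches;
  }
  return mismatches;
}
```

In this sketch, count_round_trip_mismatches() visits all 65536 bfloat bit patterns, which is one way a round-trip check could be written without the shared compiler-rt test infrastructure.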