Commit 3a4feb1d authored Jun 22, 2020 by Mikhail Maltsev

[ARM][BFloat] Implement bf16 get/set_lane without casts to i16 vectors

Currently, in order to extract an element from a bf16 vector, we cast
the vector to an i16 vector, perform the extraction, and cast the result to
bfloat. This behavior was copied from the old fp16 implementation.

The goal of this patch is to achieve optimal code generation for lane
copying intrinsics in a subsequent patch (LLVM fails to fold certain
combinations of bitcast, insertelement, extractelement and
shufflevector instructions leading to the generation of suboptimal code).

Differential Revision: https://reviews.llvm.org/D82206

parent 6ae0f5f3

Show whitespace changes

Inline Side-by-side

Please to comment