AMDGPU: Directly use rcp intrinsic in idiv expansions
The ordinary fdiv lowering is now more conservative, even with denormals disabled, so a plain 1.0 / x fdiv gets a slower expansion. Emit the rcp intrinsic directly when it is used to implement integer division, avoiding a pointlessly complex sequence.
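To illustrate why a reciprocal is enough for this purpose, here is a rough CPU-side sketch of integer division built on a float reciprocal. It is not the code from AMDGPUCodeGenPrepare.cpp: the names `approx_rcp` and `udiv24_via_rcp` are hypothetical, `approx_rcp` merely stands in for the hardware rcp instruction, and the integer fix-up loops are a simplification chosen for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for the hardware reciprocal. On a CPU this is just
// 1.0f / x; the real rcp instruction is an approximation.
static float approx_rcp(float x) { return 1.0f / x; }

// Unsigned division of two values that each fit in 24 bits, so both convert
// to float without rounding.
uint32_t udiv24_via_rcp(uint32_t a, uint32_t b) {
  assert(b != 0 && a < (1u << 24) && b < (1u << 24));
  float fa = static_cast<float>(a);
  float fb = static_cast<float>(b);

  // Quotient estimate: one reciprocal and one multiply, no full fdiv.
  float fq = std::trunc(fa * approx_rcp(fb));
  uint64_t q = static_cast<uint64_t>(fq);

  // Integer fix-up (an assumption of this sketch): the estimate may be
  // slightly off in either direction, so nudge it until
  // q * b <= a < (q + 1) * b holds.
  while (q * b > a) --q;
  while (a - q * b >= b) ++q;
  return static_cast<uint32_t>(q);
}
```

For instance, `udiv24_via_rcp(100, 7)` returns 14. Because the result is corrected afterwards, the reciprocal does not need the extra accuracy handling that a general-purpose fdiv lowering provides, which is the motivation the commit message gives for emitting rcp directly.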
Showing 4 changed files:
- llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp: 7 additions, 2 deletions
- llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fold-binop-select.ll: 2 additions, 2 deletions
- llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll: 72 additions, 72 deletions
- llvm/test/CodeGen/AMDGPU/divrem24-assume.ll: 1 addition, 1 deletion