[AArch64] Add more efficient bitwise vector reductions.
Improves the codegen for VECREDUCE_{AND,OR,XOR} operations on AArch64. Currently, these are fully scalarized unless the vector is a <N x i1>.

For vectors whose elements are not i1, this patch reduces the lowering to O(log(N)) operations, where N is the number of vector elements, by repeatedly applying the bitwise operation to the two halves of the vector. <N x i1> bitwise reductions are handled using VECREDUCE_{UMAX,UMIN,ADD} instead.

I had to update quite a few codegen tests for these changes; the general trend is a lower instruction count. Since the vector reductions already have tests, I haven't added any new ones.

Differential Revision: https://reviews.llvm.org/D148185
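The halving strategy from the commit message can be modeled with plain Python lists. This is only an illustrative sketch, not the code added to AArch64ISelLowering.cpp: the names `halving_reduce`, `i1_reduce_and`, `i1_reduce_or`, and `i1_reduce_xor` are hypothetical, the lane count is assumed to be a power of two, and the model does not cover any point where the real lowering might switch from vector to scalar registers. The i1 helpers assume the usual correspondence for single-bit lanes (AND as unsigned minimum, OR as unsigned maximum, XOR as the low bit of the sum).

```python
from functools import reduce
from operator import and_, or_, xor


def halving_reduce(op, lanes):
    """Reduce `lanes` with `op` by repeatedly combining the two halves.

    Returns (result, steps), where `steps` counts the element-wise
    operations performed, i.e. log2(len(lanes)).
    Assumption: the lane count is a non-zero power of two.
    """
    n = len(lanes)
    if n == 0 or n & (n - 1):
        raise ValueError("lane count must be a non-zero power of two")
    lanes = list(lanes)
    steps = 0
    while len(lanes) > 1:
        half = len(lanes) // 2
        # One element-wise operation: lane i of the low half is combined
        # with lane i of the high half, halving the vector length.
        lanes = [op(lo, hi) for lo, hi in zip(lanes[:half], lanes[half:])]
        steps += 1
    return lanes[0], steps


# Single-bit (i1) lanes, modeled as the integers 0 and 1.
def i1_reduce_and(bits):
    return min(bits)       # all lanes set  <=> unsigned minimum is 1


def i1_reduce_or(bits):
    return max(bits)       # any lane set   <=> unsigned maximum is 1


def i1_reduce_xor(bits):
    return sum(bits) & 1   # parity         <=> low bit of the sum


if __name__ == "__main__":
    lanes = [0b1100, 0b1010, 0b1111, 0b0110, 0b1110, 0b1011, 0b0111, 0b1101]
    for op in (and_, or_, xor):
        value, steps = halving_reduce(op, lanes)
        # The halving result should agree with a lane-by-lane reduction.
        assert value == reduce(op, lanes)
        print(op.__name__, bin(value), "steps:", steps)
```

For an 8-lane input the loop runs three times, whereas a fully scalarized reduction combines the lanes one at a time with seven operations.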
Showing
- llvm/lib/Target/AArch64/AArch64ISelLowering.cpp: 135 additions, 0 deletions
- llvm/test/CodeGen/AArch64/dag-combine-setcc.ll: 16 additions, 12 deletions
- llvm/test/CodeGen/AArch64/double_reduct.ll: 9 additions, 9 deletions
- llvm/test/CodeGen/AArch64/illegal-floating-point-vector-compares.ll: 5 additions, 3 deletions
- llvm/test/CodeGen/AArch64/reduce-and.ll: 44 additions, 85 deletions
- llvm/test/CodeGen/AArch64/reduce-or.ll: 46 additions, 84 deletions
- llvm/test/CodeGen/AArch64/reduce-xor.ll: 46 additions, 84 deletions
- llvm/test/CodeGen/AArch64/sve-fixed-length-log-reduce.ll: 12 additions, 12 deletions
- llvm/test/CodeGen/AArch64/sve-fixed-length-ptest.ll: 6 additions, 12 deletions
- llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-ptest.ll: 3 additions, 3 deletions
- llvm/test/CodeGen/AArch64/vecreduce-and-legalization.ll: 27 additions, 28 deletions