Skip to content
Commit 939cf9b1 authored by Simon Pilgrim's avatar Simon Pilgrim
Browse files

[AArch64] Use neon instructions for i64/i128 ISD::PARITY calculation

As noticed on D129765 and reported on Issue #56531 - aarch64 targets can use the neon ctpop + add-reduce instructions to speed up scalar ctpop instructions, but we fail to do this for parity calculations.

I'm not sure where the cutoff should be for specific CPUs, but i64 (+ i128 special case) shows a definite reduction in instruction count. i32 is about the same (but scalar <-> neon transfers are probably more costly?), and sub-i32 promotion looks to be a definite regression compared to parity expansion optimized for those widths.

Differential Revision: https://reviews.llvm.org/D130246
parent 8f0ba6c4
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment