[X86] Lower the cost of avx512 horizontal bool and/or reductions to... (103968d1) · Commits · Lorenzo Albano / LLVM bpEVL

Commit 103968d1 authored Nov 04, 2019 by Craig Topper

[X86] Lower the cost of avx512 horizontal bool and/or reductions to 2*log2(bitwidth)+1 for legal types.

This better represents the kshift+binop we'd get for each stage
before the final extract. It's likely we'll do even better by
doing a kmov and a cmp with a GPR, but this is a good start.
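
As a rough illustration of the arithmetic in the title, the sketch below computes 2*log2(bitwidth)+1 for a power-of-two mask width: two operations (kshift and binop) per halving stage, plus one for the final extract. The function names are hypothetical and are not part of LLVM's cost model API; this is only a worked example of the formula.

```cpp
#include <cassert>

// Hypothetical helper (not an LLVM API): log2 of a power-of-two element count.
unsigned log2PowerOf2(unsigned NumElts) {
  assert(NumElts != 0 && (NumElts & (NumElts - 1)) == 0 &&
         "expected a power-of-two element count");
  unsigned Log = 0;
  while (NumElts > 1) {
    NumElts >>= 1;
    ++Log;
  }
  return Log;
}

// Hypothetical helper (not an LLVM API): cost of a horizontal bool and/or
// reduction under the formula from the commit title.
unsigned boolReductionCost(unsigned NumElts) {
  unsigned Stages = log2PowerOf2(NumElts); // one halving step per stage
  unsigned PerStage = 2;                   // assumed: one kshift + one binop
  unsigned FinalExtract = 1;               // extract of the remaining element
  return PerStage * Stages + FinalExtract;
}
```

Under these assumptions a 16-element mask gives 2*4+1 = 9 and a 64-element mask gives 2*6+1 = 13.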

The default handling was costing a worst-case single-source
permute shuffle of the vector before the binop. This worst
case assumes the shuffle might have to be emulated with
extracts and inserts. But since we know we're doing a reduction,
we can assume we'll get kshift lowering.
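
To make the shift-and-combine shape of such a reduction concrete, here is a scalar emulation on an integer bitmask. It is a sketch only: it assumes each stage is a right shift by half the remaining width followed by the binop, and the function names are hypothetical. It does not reproduce the actual X86 lowering code.

```cpp
#include <cstdint>

// Hypothetical scalar model of an AND reduction over the low NumElts bits of
// Mask (NumElts assumed to be a power of two, at most 64).
bool andReduceMask(uint64_t Mask, unsigned NumElts) {
  for (unsigned Shift = NumElts / 2; Shift >= 1; Shift /= 2)
    Mask &= Mask >> Shift; // models one kshift followed by one AND
  return (Mask & 1) != 0;  // models the final extract of element 0
}

// Same structure for an OR reduction.
bool orReduceMask(uint64_t Mask, unsigned NumElts) {
  for (unsigned Shift = NumElts / 2; Shift >= 1; Shift /= 2)
    Mask |= Mask >> Shift; // models one kshift followed by one OR
  return (Mask & 1) != 0;  // models the final extract of element 0
}
```

Each loop iteration corresponds to one stage, so a 16-element mask takes four shift-and-combine stages before the extract. No element-wise extracts or inserts appear anywhere in this shape.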

There's still some room for improvement here, but this is
much better than it was.

parent 58acbce3
