Skip to content
Commit 103968d1 authored by Craig Topper's avatar Craig Topper
Browse files

[X86] Lower the cost of avx512 horizontal bool and/or reductions to...

[X86] Lower the cost of avx512 horizontal bool and/or reductions to 2*log2(bitwidth)+1 for legal types.

This better represents the kshift+binop we'd get for each stage
before the final extract. Its likely we'll do even better by
doing a kmov and a cmp with a GPR, but this is a good start.

The default handling was costing a worst case single source
permute shuffle of the vector before the binop. This worst
case assumes the shuffle might have to be emulated with
extracts and inserts. But since we know we're doing a reduction
we can assume we'll get kshift lowering.

There's still some room for improvement here, but this is
much better than it was.
parent 58acbce3
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment