Commit 4101c7bf authored Nov 10, 2021 by Roman Lebedev

[X86][Costmodel] `getReplicationShuffleCost()`: implement cost model for 32/64 bit-wide elements with AVX512F

This models lowering to `vpermd`/`vpermq`/`vpermps`/`vpermpd`,
which take a single input vector and a single index vector,
and are cross-lane. So far I haven't seen evidence that
replication ever results in demanding more than a single
input vector per output vector.
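
The single-input-vector observation can be checked with a small standalone sketch. It is not the code from this patch: `buildReplicationMask` and `maxSourceRegsPerDestReg` are hypothetical helper names, and the sketch assumes that source and destination are both split into registers of `EltsPerReg` elements (e.g. 16 for 32-bit elements in a 512-bit register).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: build a replication mask in which each of the VF
// source elements is repeated ReplicationFactor times,
// e.g. VF=4, ReplicationFactor=2 gives <0,0,1,1,2,2,3,3>.
std::vector<int> buildReplicationMask(int ReplicationFactor, int VF) {
  assert(ReplicationFactor > 0 && VF > 0 && "expected positive sizes");
  std::vector<int> Mask;
  Mask.reserve(static_cast<std::size_t>(ReplicationFactor) * VF);
  for (int Elt = 0; Elt != VF; ++Elt)
    for (int Copy = 0; Copy != ReplicationFactor; ++Copy)
      Mask.push_back(Elt);
  return Mask;
}

// Hypothetical helper: split the mask into destination registers of
// EltsPerReg elements (assumption: the source is split the same way) and
// return the largest number of distinct source registers that any single
// destination register reads from.
int maxSourceRegsPerDestReg(const std::vector<int> &Mask, int EltsPerReg) {
  assert(EltsPerReg > 0 && "expected a positive register width");
  const std::size_t Step = static_cast<std::size_t>(EltsPerReg);
  int Max = 0;
  for (std::size_t Begin = 0; Begin < Mask.size(); Begin += Step) {
    const std::size_t End = std::min(Mask.size(), Begin + Step);
    std::set<int> SrcRegs;
    for (std::size_t I = Begin; I != End; ++I)
      SrcRegs.insert(Mask[I] / EltsPerReg);
    Max = std::max(Max, static_cast<int>(SrcRegs.size()));
  }
  return Max;
}
```

For example, `maxSourceRegsPerDestReg(buildReplicationMask(3, 32), 16)` evaluates to 1: source element `EltsPerReg` starts at destination index `EltsPerReg * ReplicationFactor`, which is itself a destination register boundary, so under this splitting no destination register straddles two source registers.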

This results in *shockingly* lower costs :)

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113350

parent bef966eb