Skip to content
Unverified Commit 0e71ae6d authored by Roman Lebedev's avatar Roman Lebedev
Browse files

[X86][Costmodel] Load/store i8 Stride=4 VF=16 interleaving costs

While we already model this tuple, the values are divergent from reality, so fix them.

The only sched models that for cpu's that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/TrGW7cKsE - for intels `Block RThroughput: =24.0`; for ryzens, `Block RThroughput: <=12.0`
So pick cost of `24`.

For store we have:
https://godbolt.org/z/Mh7qaqEfe - for intels `Block RThroughput: =8.0`; for ryzens, `Block RThroughput: <=4.0`
So pick cost of `8`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110970
parent 74e4a0e3
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment