Skip to content
Commit 2005ae15 authored by Simon Pilgrim's avatar Simon Pilgrim
Browse files

[X86][SLM] WriteVecIMul instructions only take 1uop (REAPPLIED)

The xmm variant have half the throughput (and +1cy latency) of the mmx variants, but are still 1uop.

I still need to do more thorough testing of SLM on test-suite before fixing the obvious bad numbers for WritePMULLD.

But this helps the D103695 helper script get to more accurate numbers for vXi32 multiplies of extended operands (i.e. we can use PMADDWD, PMULLW/PMULHW etc). Matches what Intel AoM / Agner / llvm-exegesis reports.
parent ac51d692
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment