Commit 9f551ad6 authored by Simon Pilgrim

[X86][SSE] Aggressively use PMADDWD for v4i32 multiplies with 17 or more leading zeros

As discussed in D41484, PMADDWD for 'zero extended' vXi32 is nearly always a better option than PMULLD:
On SNB it results in code that is no faster, but also no slower, so we may as well keep it.
On KNL PMADDWD only has half the throughput, so I've disabled it there; ideally there'd be a better way than this.
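
As a rough illustration of why the 17-leading-zeros condition matters, here is a scalar C++ model of a single 32-bit lane. It is only a sketch, not the code from this patch: the helper names (`pmaddwd_lane`, `pmulld_lane`, `has_17_leading_zeros`) are made up for the example, and it assumes the condition applies to both operands. PMADDWD multiplies the signed 16-bit halves of each lane pairwise and adds the two products, so when both inputs fit in 15 bits the high halves contribute zero and the low halves are non-negative, which leaves exactly the 32-bit product.

```cpp
#include <cstdint>

// Hypothetical helper: true if the top 17 bits of v are zero (v < 2^15).
inline bool has_17_leading_zeros(std::uint32_t v) {
    return (v >> 15) == 0;
}

// Scalar model of one 32-bit lane of PMADDWD: split each operand into two
// signed 16-bit halves, multiply matching halves, and add the two products.
inline std::int32_t pmaddwd_lane(std::uint32_t a, std::uint32_t b) {
    auto lo = [](std::uint32_t v) {
        return static_cast<std::int32_t>(static_cast<std::int16_t>(v & 0xFFFFu));
    };
    auto hi = [](std::uint32_t v) {
        return static_cast<std::int32_t>(static_cast<std::int16_t>(v >> 16));
    };
    // Add in unsigned arithmetic so the sum wraps instead of overflowing.
    std::uint32_t sum = static_cast<std::uint32_t>(lo(a) * lo(b)) +
                        static_cast<std::uint32_t>(hi(a) * hi(b));
    return static_cast<std::int32_t>(sum);
}

// Scalar model of one 32-bit lane of PMULLD: low 32 bits of the product.
inline std::int32_t pmulld_lane(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::int32_t>(a * b);
}
```

With inputs such as 32767 and 32767 the two models agree. With 0x8000 and 2, where one operand has only 16 leading zeros, the low half is read as -32768 and the results differ, which is why 16 leading zeros would not be enough.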

Differential Revision: https://reviews.llvm.org/D42258

llvm-svn: 323367
parent a9263c89