Commit e81bfbad authored Sep 21, 2014 by Chandler Carruth

[x86] Teach the new vector shuffle lowering of v4f64 to prefer a direct
VBLENDPD over using VSHUFPD.

While the 256-bit variant of VBLENDPD slows down to the same speed as
VSHUFPD on Sandy Bridge CPUs, it has twice the throughput on Ivy Bridge
CPUs, much like it does everywhere for 128-bit vectors. There isn't a
downside, so just eagerly use this instruction when it suffices.
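
As an illustration of when a blend "suffices", here is a minimal C++17 sketch. It is not the code from this commit. It assumes LLVM's usual two-input shuffle mask convention for 4-element vectors (0..3 select from the first input, 4..7 from the second, negative means undef), and the helper name blendImmediateForV4F64 is hypothetical. A shuffle is a blend only if every lane stays in its own position, and the VBLENDPD immediate then has bit i set when lane i comes from the second input.

```cpp
#include <array>
#include <cstdint>
#include <optional>

// Hypothetical helper for illustration only; not LLVM's implementation.
// Mask convention (assumed): 0..3 pick lanes of V1, 4..7 pick lanes of V2,
// and a negative value marks an undef lane.
// Returns the VBLENDPD immediate if the shuffle is a pure blend, or
// std::nullopt if some lane would have to move to a different position.
std::optional<uint8_t> blendImmediateForV4F64(const std::array<int, 4> &Mask) {
  uint8_t Imm = 0;
  for (int i = 0; i < 4; ++i) {
    int M = Mask[i];
    if (M < 0)
      continue; // Undef lane: either input is acceptable, leave the bit clear.
    if (M == i)
      continue; // Lane i taken from V1 in place: bit i stays 0.
    if (M == i + 4) {
      Imm |= static_cast<uint8_t>(1u << i); // Lane i taken from V2 in place.
      continue;
    }
    return std::nullopt; // Lane crosses positions, so a blend cannot do it.
  }
  return Imm;
}
```

For example, the mask <0, 5, 2, 7> keeps every lane in place and yields the immediate 0b1010, while <1, 5, 2, 7> moves lane 1 of the first input into position 0 and so is not a blend.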

llvm-svn: 218208

parent 6aea21df
