Skip to content
Commit 0adda1e4 authored by Chandler Carruth's avatar Chandler Carruth
Browse files

[x86] Adjust the patterns for lowering X86vzmovl nodes which don't

perform a load to use blendps rather than movss when it is available.

For non-loads, blendps is *much* faster. It can execute on two ports in
Sandy Bridge and Ivy Bridge, and *three* ports on Haswell. This fixes
one of the "regressions" from aggressively taking the "insertion" path
in the new vector shuffle lowering.

This does highlight one problem with blendps -- it isn't commuted as
heavily as it should be. That's future work though.

llvm-svn: 219022
parent 0aca4b1a
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment