[X86][AVX] Only shuffle the lower half of vectors if the upper half is undefined
First step towards making better use of AVX's implicit zeroing of the upper half of a 256-bit vector by instructions that only act on the lower 128 bits, as discussed on D14151.

Besides 128-bit shuffle instructions generally being more capable, this can be faster on older CPUs with 128-bit ALUs (e.g. Jaguar, Sandy Bridge) that must split 256-bit vector operations into multiple micro-ops.

Also moved the similar subvector extraction shuffle combines from PerformShuffleCombine256 to lowerVectorShuffle.

Note: I've avoided combining shuffles that reference elements from the upper halves of the input vectors. This may be revisited in future work (AVX1 would probably always gain, but AVX2 does have some cross-lane shuffle instructions).

Differential Revision: http://reviews.llvm.org/D15477

llvm-svn: 256332
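For illustration only, the condition described above can be sketched as a standalone mask check. This is not the code from the patch: the helper names canShuffleLowerHalfOnly and narrowMask are hypothetical, and the mask convention (-1 for undef, [0, N) for the first input, [N, 2N) for the second) is an assumption modelled on common shuffle-mask encodings.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not taken from the patch.
// Assumed mask convention: -1 is undef, [0, N) selects from the first
// input, [N, 2N) selects from the second input, where N = Mask.size().
bool canShuffleLowerHalfOnly(const std::vector<int> &Mask) {
  const int N = static_cast<int>(Mask.size());
  if (N == 0 || N % 2 != 0)
    return false;
  const int Half = N / 2;

  // The upper half of the result must be entirely undefined.
  for (int i = Half; i < N; ++i)
    if (Mask[i] >= 0)
      return false;

  // The lower half may only reference the lower half of either input.
  for (int i = 0; i < Half; ++i) {
    const int M = Mask[i];
    if (M < 0)
      continue; // undef element, no constraint
    if (M >= 2 * N || M % N >= Half)
      return false;
  }
  return true;
}

// Hypothetical helper: rewrite a qualifying full-width mask as a
// half-width mask over the lower halves of the two inputs.
std::vector<int> narrowMask(const std::vector<int> &Mask) {
  assert(canShuffleLowerHalfOnly(Mask) && "mask does not qualify");
  const int N = static_cast<int>(Mask.size());
  const int Half = N / 2;
  std::vector<int> Narrow(Half, -1);
  for (int i = 0; i < Half; ++i) {
    const int M = Mask[i];
    if (M < 0)
      continue;
    // First-input indices are unchanged; second-input indices are
    // rebased from [N, N + Half) to [Half, 2 * Half).
    Narrow[i] = (M < N) ? M : M - Half;
  }
  return Narrow;
}
```

Under these assumptions, an 8-element mask such as {0, 8, 1, 9, -1, -1, -1, -1} qualifies and narrows to the 4-element mask {0, 4, 1, 5}, while {4, 0, 1, 2, -1, -1, -1, -1} is rejected because it reads from the upper half of an input.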