[x86] Teach the new vector shuffle lowering a fancier way to lower
256-bit vectors with lane-crossing. Rather than immediately decomposing to 128-bit vectors, try flipping the 256-bit vector lanes, shuffling them and blending them together. This reduces our worst case shuffle by a pretty significant margin across the board. llvm-svn: 218446
Loading
Please register or sign in to comment