[x86] Teach the decomposed shuffle/blend lowering to use an early blend
when that will allow it to lower with a single permute instead of multiple permutes.

It tries to detect when only a single permute would be needed in either case, and keeps the existing lowering there to maximize folding of loads and such. This cuts a *lot* of the AVX2 shuffle permute counts in half. =]

llvm-svn: 229309
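
As a rough illustration of the idea (a standalone sketch, not the code from this commit): a two-input shuffle mask can be lowered as blend-then-permute only if no element index is needed from both inputs at once, because a blend can pick just one of V1[i] or V2[i] for each lane i. The names `BlendThenPermute` and `tryBlendThenPermute` are hypothetical, and the mask convention (entries 0..Size-1 select from V1, Size..2*Size-1 select from V2, -1 is undef) is assumed to match the usual shuffle mask encoding.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type: the two masks of a blend followed by one permute.
struct BlendThenPermute {
  bool Viable;                  // false if the early blend cannot be used
  std::vector<int> BlendMask;   // per lane: which input element is blended in
  std::vector<int> PermuteMask; // per output lane: which blended lane to read
};

// Sketch of the viability check. Mask entries in [0, Size) refer to V1,
// entries in [Size, 2*Size) refer to V2, and -1 means "undef".
BlendThenPermute tryBlendThenPermute(const std::vector<int> &Mask) {
  const int Size = static_cast<int>(Mask.size());
  BlendThenPermute R{true, std::vector<int>(Size, -1),
                     std::vector<int>(Size, -1)};

  for (int i = 0; i < Size; ++i) {
    const int M = Mask[i];
    if (M < 0)
      continue; // undef lane, nothing to place

    assert(M < 2 * Size && "Shuffle input is out of bounds.");

    // Lane of the blended vector that must hold this element.
    const int Slot = M % Size;
    if (R.BlendMask[Slot] == -1) {
      R.BlendMask[Slot] = M;
    } else if (R.BlendMask[Slot] != M) {
      // Both V1[Slot] and V2[Slot] are needed: a single blend cannot
      // provide both, so fall back to permuting each input separately.
      R.Viable = false;
      return R;
    }

    R.PermuteMask[i] = Slot;
  }
  return R;
}
```

For example, with four lanes the mask {1, 4, 3, 6} needs V1[1], V2[0], V1[3] and V2[2], which all sit in distinct lanes, so one blend plus one permute suffices. The mask {0, 4, 1, 5} needs both V1[0] and V2[0], so the sketch reports the early blend as not viable.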