Commit f0f0d273 authored Feb 19, 2015 by Chandler Carruth

[x86] Simplify the pre-SSSE3 v16i8 lowering significantly by decomposing
them into permutes and a blend with the generic decomposition logic.

This works really well in almost every case and lets the code only
manage the expansion of a single input into two v8i16 vectors to perform
the actual shuffle. The blend-based merging is often much nicer than the
pack-based merging that this replaces. The only place where it isn't is
when we end up blending between two packs where a single pack would do. To
handle that case, just teach the v2i64 lowering to handle these blends
by digging out the operands.
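
As a rough illustration of the "permutes and a blend" decomposition described
above, here is a scalar C++ model. It is only a sketch: the names V16, Mask,
shuffle2, permute, blend and shuffleViaPermutesAndBlend are hypothetical and
do not come from the LLVM sources, and the mask convention (0..15 selects from
the first input, 16..31 from the second, -1 means undef) is assumed to mirror
the usual shufflevector convention. Undef lanes are simply left as zero here.

```cpp
#include <array>
#include <cstdint>

using V16 = std::array<std::uint8_t, 16>;
using Mask = std::array<int, 16>;  // 0..15: from a, 16..31: from b, -1: undef

// Reference semantics of a two-input byte shuffle.
V16 shuffle2(const V16 &a, const V16 &b, const Mask &m) {
  V16 r{};
  for (int i = 0; i < 16; ++i)
    if (m[i] >= 0)
      r[i] = m[i] < 16 ? a[m[i]] : b[m[i] - 16];
  return r;
}

// Single-input permute: each lane reads from one source vector only.
V16 permute(const V16 &v, const Mask &m) {
  V16 r{};
  for (int i = 0; i < 16; ++i)
    if (m[i] >= 0)
      r[i] = v[m[i]];
  return r;
}

// Blend: lane i comes from b when fromB[i] is set, otherwise from a.
// No lane changes position.
V16 blend(const V16 &a, const V16 &b, const std::array<bool, 16> &fromB) {
  V16 r{};
  for (int i = 0; i < 16; ++i)
    r[i] = fromB[i] ? b[i] : a[i];
  return r;
}

// Split the two-input mask into one permute mask per input plus a blend
// mask, then recombine. The result matches shuffle2 lane for lane.
V16 shuffleViaPermutesAndBlend(const V16 &a, const V16 &b, const Mask &m) {
  Mask ma, mb;
  ma.fill(-1);
  mb.fill(-1);
  std::array<bool, 16> fromB{};
  for (int i = 0; i < 16; ++i) {
    if (m[i] < 0)
      continue;  // undef lane: neither permute needs to produce it
    if (m[i] < 16) {
      ma[i] = m[i];
    } else {
      mb[i] = m[i] - 16;
      fromB[i] = true;
    }
  }
  return blend(permute(a, ma), permute(b, mb), fromB);
}
```

The point of the split is that each permute only has to deal with a single
input, which is the case the message says the v16i8 code is now left to
manage; the final merge is a position-preserving select.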

With this we're down to only really random permutations that cause an
explosion of instructions.

llvm-svn: 229849

parent c31da701
