- Jul 29, 2011
-
-
Bruno Cardoso Lopes authored
generation to always catch the weird cases. llvm-svn: 136453
-
Bruno Cardoso Lopes authored
llvm-svn: 136452
-
Bruno Cardoso Lopes authored
undef mask elements. This fixes PR10529. llvm-svn: 136450
-
Bruno Cardoso Lopes authored
Also tidy up code a bit. llvm-svn: 136449
-
Bruno Cardoso Lopes authored
Also keep PALIGNR masks from matching 256-bit vectors, which are not supported. It's also a step toward solving PR10489. llvm-svn: 136448
-
- Jul 28, 2011
-
-
Bruno Cardoso Lopes authored
llvm-svn: 136324
-
Bruno Cardoso Lopes authored
a convert pattern close to the instruction definition. llvm-svn: 136320
-
Eli Friedman authored
llvm-svn: 136283
-
- Jul 27, 2011
-
-
Jeffrey Yasskin authored
C++0x. llvm-svn: 136211
-
Bruno Cardoso Lopes authored
llvm-svn: 136201
-
Bruno Cardoso Lopes authored
usage of the shuffle bitmask. Both work in 128-bit lanes without crossing, but in the former the mask of the high part is the same one used by the low part, while in the latter both lanes have independent masks. Handle this properly and add support for vpermilpd. llvm-svn: 136200
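The independent-mask behaviour can be modelled in plain C++. This is only a sketch, assuming "the latter" refers to the 256-bit immediate form of vpermilpd named in the message; `permilpd256` and `checkPermilpd` are hypothetical names for illustration, not LLVM functions.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar model (assumption: 256-bit vpermilpd, immediate form).
// Each of the four result elements has its own selector bit, so the
// low and high 128-bit lanes can be permuted differently, but an
// element never crosses its lane.
std::array<double, 4> permilpd256(const std::array<double, 4>& src,
                                  std::uint8_t imm) {
  std::array<double, 4> dst{};
  for (int i = 0; i < 4; ++i) {
    int lane = i / 2;          // 0 = low lane, 1 = high lane
    int sel = (imm >> i) & 1;  // bit i picks element 0 or 1 of that lane
    dst[i] = src[lane * 2 + sel];
  }
  return dst;
}

// Mask 0x6 keeps the low lane as-is and swaps the high lane.
void checkPermilpd() {
  std::array<double, 4> src = {0.0, 1.0, 2.0, 3.0};
  std::array<double, 4> out = permilpd256(src, 0x6);
  assert(out[0] == 0.0 && out[1] == 1.0);
  assert(out[2] == 3.0 && out[3] == 2.0);
}
```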
-
Benjamin Kramer authored
On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add, which can be commuted and encoded efficiently. This code is generated for __builtin_clz and friends. llvm-svn: 136167
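The algebra behind this fold can be checked with a small sketch. It relies on two identities for unsigned (wrapping) arithmetic: `-y == ~y + 1` and `~(x ^ c) == x ^ ~c`. The function names are hypothetical, and the constants 31 and 31 are only an illustrative choice, not taken from the commit.

```cpp
#include <cassert>
#include <cstdint>

// Original shape: an immediate on the left-hand side of a sub.
constexpr std::uint32_t subOfXor(std::uint32_t c1, std::uint32_t x,
                                 std::uint32_t c2) {
  return c1 - (x ^ c2);
}

// Rewritten shape: the negation is folded into the xor constant and
// the sub immediate is incremented, leaving a commutable add:
//   c1 - (x ^ c2) == c1 + ~(x ^ c2) + 1 == (x ^ ~c2) + (c1 + 1)
constexpr std::uint32_t addOfXor(std::uint32_t c1, std::uint32_t x,
                                 std::uint32_t c2) {
  return (x ^ ~c2) + (c1 + 1);
}

// Compare both forms over a small range of inputs.
void checkFold() {
  for (std::uint32_t x = 0; x < 1024; ++x)
    assert(subOfXor(31u, x, 31u) == addOfXor(31u, x, 31u));
}
```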
-
Bruno Cardoso Lopes authored
different from the previous 128-bit ones because they work in lanes. Update a few comments and add testcases. llvm-svn: 136157
-
- Jul 26, 2011
-
-
Eli Friedman authored
Prevent x86-specific DAGCombine from creating nodes with illegal type (which could not be selected). Fixes a minor isel issue that was breaking the testcase from r136130. llvm-svn: 136148
-
Bruno Cardoso Lopes authored
support for 256-bit versions (but no instruction selection yet, coming next). llvm-svn: 136050
-
Bruno Cardoso Lopes authored
llvm-svn: 136049
-
Bruno Cardoso Lopes authored
This also fixes PR10452 llvm-svn: 136004
-
Bruno Cardoso Lopes authored
shuffle before inserting on a 256-bit vector. - Add AVX versions of movd/movq instructions - Introduce a few COPY patterns to match insert_subvector instructions. This turns a trivial insert_subvector instruction into a register copy, coalescing the xmm into a ymm and avoiding emitting one more instruction. llvm-svn: 136002
-
Bruno Cardoso Lopes authored
native 256-bit vector instruction to do scalar_to_vector. llvm-svn: 136001
-
- Jul 25, 2011
-
-
Eli Friedman authored
Addresses PR10466, although the crash from that PR only triggers in cases where DAGCombine misses optimizing a shuffle. llvm-svn: 135980
-
- Jul 22, 2011
-
-
Rafael Espindola authored
too. Patch by Jeff Muizelaar. llvm-svn: 135789
-
Dan Gohman authored
of doing the RAUW calls for the overflow value itself. This makes it more consistent with how the rest of LegalizeDAG works. llvm-svn: 135788
-
Benjamin Kramer authored
Remove the escaped newline. llvm-svn: 135739
-
Bruno Cardoso Lopes authored
the way to go. Doing this here will prevent several node matches later, and would have to force looking all the way through several VINSERTF128/VEXTRACTF128 chains to optimize simple things. llvm-svn: 135730
-
Bruno Cardoso Lopes authored
and was actually very wrong, fix it and make it simpler. Also remove the ConcatVectors function, which is unused now. - Fix an introduction of useless nodes in r126664 and r126264. The VUNPCKL* should never be introduced because we don't want duplicate nodes for 128-bit AVX and non-AVX modes; the actual instruction difference only exists during isel, but not for target specific DAG nodes. We only introduce V* target nodes when there is no 128-bit version already there. - Fix a fragile test and make it more useful. llvm-svn: 135729
-
Bruno Cardoso Lopes authored
vxorps + vinsertf128 pair of instructions llvm-svn: 135727
-
Bruno Cardoso Lopes authored
directly supported and should be promoted and handled by smaller shuffles. llvm-svn: 135726
-
Bruno Cardoso Lopes authored
llvm-svn: 135725
-
- Jul 21, 2011
-
-
Bruno Cardoso Lopes authored
- Add more bitcasts for v16i16 - Since 135661 and 135662 already added the splat logic, just add one more splat test for v16i16 llvm-svn: 135663
-
Bruno Cardoso Lopes authored
instruction introduced in AVX, which can operate on 128- and 256-bit vectors. It considers a 256-bit vector as two independent 128-bit lanes. It can permute any 32- or 64-bit elements inside a lane, and restricts the second lane to have the same permutation as the first one. With the improved splat support introduced earlier today, adding codegen for this instruction enables more efficient 256-bit code.
Instead of:
  vextractf128 $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vextractf128 $1, %ymm0, %xmm1
  shufps $1, %xmm1, %xmm1
  movss %xmm1, 28(%rsp)
  movss %xmm1, 24(%rsp)
  movss %xmm1, 20(%rsp)
  movss %xmm1, 16(%rsp)
  vextractf128 $0, %ymm0, %xmm0
  shufps $1, %xmm0, %xmm0
  movss %xmm0, 12(%rsp)
  movss %xmm0, 8(%rsp)
  movss %xmm0, 4(%rsp)
  movss %xmm0, (%rsp)
  vmovaps (%rsp), %ymm0
We get:
  vextractf128 $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vpermilps $85, %ymm0, %ymm0
llvm-svn: 135662
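To see why the immediate $85 acts as a per-lane splat, here is a scalar model of the 256-bit immediate form of vpermilps. It is a sketch of the instruction's semantics only; `permilps256` and `checkPermilps` are hypothetical names and are not part of LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar model (assumption: 256-bit vpermilps, immediate form).
// The 8-bit immediate holds four 2-bit selectors, and the same four
// selectors are applied to both 128-bit lanes.
std::array<float, 8> permilps256(const std::array<float, 8>& src,
                                 std::uint8_t imm) {
  std::array<float, 8> dst{};
  for (int lane = 0; lane < 2; ++lane) {
    for (int i = 0; i < 4; ++i) {
      int sel = (imm >> (2 * i)) & 3;  // element index within the lane
      dst[lane * 4 + i] = src[lane * 4 + sel];
    }
  }
  return dst;
}

// 85 == 0b01010101: every selector is 1, so each lane is filled with
// its own element 1.
void checkPermilps() {
  std::array<float, 8> src = {0, 1, 2, 3, 4, 5, 6, 7};
  std::array<float, 8> out = permilps256(src, 85);
  for (int i = 0; i < 4; ++i) {
    assert(out[i] == 1.0f);
    assert(out[4 + i] == 5.0f);
  }
}
```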
-
Bruno Cardoso Lopes authored
refactor the code and add a bunch of comments. The final shuffle emitted by handling 256-bit types is suitable for the VPERM shuffle instruction, which is going to be introduced in a later commit (with a testcase that covers this commit). llvm-svn: 135661
-
Bruno Cardoso Lopes authored
llvm-svn: 135656
-
- Jul 20, 2011
-
-
Evan Cheng authored
There is still a bit more refactoring left to do in Targets. But we are now very close to fixing all the layering issues in MC. llvm-svn: 135611
-
- Jul 18, 2011
-
-
Evan Cheng authored
to MCRegisterInfo. Also initialize the mapping at construction time. This patch eliminates TargetRegisterInfo from TargetAsmInfo. It's another step towards fixing the layering violation. llvm-svn: 135424
-
Chris Lattner authored
llvm-svn: 135375
-
- Jul 16, 2011
-
-
Bruno Cardoso Lopes authored
1) Make non-legal 256-bit loads be promoted to v4i64. This lets us canonicalize the loads and handle things the same way we handle them for 128-bit registers. Despite what one of the removed comments explained, the load promotion would not mess with VPERM; it's only a matter of doing the appropriate bitcasts when this instruction is introduced. Also make LOAD v8i32 legal.
2) Doing 1) exposed two bugs:
  - v4i64 was being promoted to itself for several opcodes (introduced in r124447 by David Greene), causing endless recursion and the stack to explode.
  - there was no support for allOnes BUILD_VECTORs, and ANDNP would fail to match because it was generating early target constant pools during lowering.
3) The testcases are already checked in; doing 1) exposed the bugs in the current testcases.
4) Tidy up code to be clearer and more explicit about AVX.
llvm-svn: 135313
-
- Jul 14, 2011
-
-
Eric Christopher authored
when determining validity of matching constraint. Allow i1 types access to the GR8 reg class for x86. Fixes PR10352 and rdar://9777108 llvm-svn: 135180
-
Nadav Rotem authored
[VECTOR-SELECT] During type legalization we often use the SIGN_EXTEND_INREG SDNode. When this SDNode is legalized during the LegalizeVector phase, it is scalarized because non-simple types are automatically marked to be expanded. In this patch we add support for lowering SIGN_EXTEND_INREG manually. This fixes CodeGen/X86/vec_sext.ll when running with the '-promote-elements' flag. llvm-svn: 135144
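The message does not say how the manual lowering is done. One common way to express a sign extension "in register" is a left shift followed by an arithmetic right shift, applied to each vector element; the scalar sketch below shows that technique under this assumption. `signExtendInReg` is a hypothetical name, not the LLVM lowering routine.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low `fromBits` bits of a 32-bit value in place.
// Assumption: shl followed by an arithmetic shr, one common strategy;
// the commit itself does not state which instructions it emits.
// Note: right-shifting a negative signed value is arithmetic on all
// mainstream compilers and is guaranteed to be so from C++20 on.
inline std::int32_t signExtendInReg(std::int32_t value, unsigned fromBits) {
  assert(fromBits >= 1 && fromBits <= 32);
  const unsigned shift = 32 - fromBits;
  // Move the narrow sign bit up to bit 31, discarding higher bits.
  const std::uint32_t shifted = static_cast<std::uint32_t>(value) << shift;
  // Shift back down, replicating the sign bit.
  return static_cast<std::int32_t>(shifted) >> shift;
}
```

For instance, treating the low 8 bits as a signed byte, 0xFF becomes -1 while 0x7F stays 127.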
-
- Jul 13, 2011
-
-
Bruno Cardoso Lopes authored
general version of X86ISD::ANDNP also made room for a little bit of refactoring. llvm-svn: 135088
-
Bruno Cardoso Lopes authored
it's later selected to an ANDNPD/ANDNPS instruction instead of the PANDN instruction. Rename it. llvm-svn: 135087
-