- Aug 22, 2011
-
-
Bruno Cardoso Lopes authored
avoiding scalarization of the compare. Reduces code from 59 to 6 instructions. Fix PR10712. llvm-svn: 138271
-
Bruno Cardoso Lopes authored
llvm-svn: 138270
-
- Aug 20, 2011
-
-
Bruno Cardoso Lopes authored
a bug and add a testcase! llvm-svn: 138123
-
- Aug 19, 2011
-
-
Craig Topper authored
llvm-svn: 138034
-
Bruno Cardoso Lopes authored
implementation! llvm-svn: 138029
-
Bruno Cardoso Lopes authored
instead of 2. They were already defined this way in their regular version, but not for the intrinsics versions (*_Int), and that would work for assembly emission but not for object code, since a MachineOperand would be missing. This commit fixes PR10697. Also removed the {VSQRT,VRSQRT,VRCP}r_Int forms and matched the intrinsic via INSERT_SUBREG+EXTRACT_SUBREG patterns. The same couldn't be done for memory versions because sse_load_f32/sse_load_f64 operands need special handling and don't work like regular "addr" operands. There are right now 114 "*_Int" and 98 "Int_*" forms! I'm slowly removing them as I step through, but hope we can get rid of these someday, they are really annoying :) llvm-svn: 138012
-
- Aug 18, 2011
-
-
Bruno Cardoso Lopes authored
v2i64 llvm-svn: 137919
-
Bruno Cardoso Lopes authored
shift amount is variable llvm-svn: 137885
-
- Aug 17, 2011
-
-
Owen Anderson authored
Allow the MCDisassembler to return a "soft fail" status code, indicating an instruction that is disassemblable, but invalid. Only used for ARM UNPREDICTABLE instructions at the moment. Patch by James Molloy. llvm-svn: 137830
-
Bruno Cardoso Lopes authored
match splats in the form (splat (scalar_to_vector (load ...))) whenever the load can be folded. All the logic and instruction emission is working, but because of PR8156 there is no way to match loads, since they can never be folded for splats. Thus, the tests are XFAILed, but I've tested and exercised all the logic using a relaxed version for checking the foldable loads, as if the bug was already fixed. This should work out of the box once PR8156 gets fixed, since MayFoldLoad will work as expected. llvm-svn: 137810
-
Bruno Cardoso Lopes authored
llvm-svn: 137808
-
Bruno Cardoso Lopes authored
vinsertf128 $1 + vpermilps $0, remove the old code that used to first do the splat in a 128-bit vector and then insert it into a larger one. This is better because the handling code gets simpler and it also makes room for the upcoming vbroadcast! llvm-svn: 137807
-
- Aug 16, 2011
-
-
Bruno Cardoso Lopes authored
there is no support for native 256-bit shuffles, be smarter in some cases, for example, when you can extract specific 128-bit parts and use regular 128-bit shuffles for them. Example:

For this shuffle:
  shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 1, i32 0, i32 7, i32 6>

This was expanded to:
  vextractf128 $1, %ymm1, %xmm2
  vpextrq $0, %xmm2, %rax
  vmovd %rax, %xmm1
  vpextrq $1, %xmm2, %rax
  vmovd %rax, %xmm2
  vpunpcklqdq %xmm1, %xmm2, %xmm1
  vpextrq $0, %xmm0, %rax
  vmovd %rax, %xmm2
  vpextrq $1, %xmm0, %rax
  vmovd %rax, %xmm0
  vpunpcklqdq %xmm2, %xmm0, %xmm0
  vinsertf128 $1, %xmm1, %ymm0, %ymm0
  ret

Now we get:
  vshufpd $1, %xmm0, %xmm0, %xmm0
  vextractf128 $1, %ymm1, %xmm1
  vshufpd $1, %xmm1, %xmm1, %xmm1
  vinsertf128 $1, %xmm1, %ymm0, %ymm0

llvm-svn: 137733
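The four-instruction sequence quoted in this commit message can be checked against the shufflevector semantics with a small lane-level model. This is only a sketch: the Python helpers below are hypothetical stand-ins named after the instructions, registers are modeled as plain lists of 64-bit lanes, and the model assumes the usual VEX behavior that a 128-bit write zeroes the upper half of the YMM register.

```python
# Toy model: a YMM register is a list of four 64-bit lanes,
# an XMM register is a list of two. All helpers are illustrative only.

def shufflevector(a, b, mask):
    """Reference semantics: index into the concatenation of a and b."""
    both = list(a) + list(b)
    return [both[i] for i in mask]

def vshufpd(imm, first, second):
    """128-bit shuffle: lane 0 is picked from `first`, lane 1 from `second`."""
    return [first[imm & 1], second[(imm >> 1) & 1]]

def vextractf128(imm, ymm):
    """Return the low (imm=0) or high (imm=1) 128-bit half."""
    return ymm[2 * imm: 2 * imm + 2]

def vinsertf128(imm, xmm, ymm):
    """Return a copy of ymm with one 128-bit half replaced by xmm."""
    out = list(ymm)
    out[2 * imm: 2 * imm + 2] = xmm
    return out

def lower_1_0_7_6(a, b):
    """Mirror the 'Now we get' sequence for the mask <1, 0, 7, 6>."""
    xmm0 = vshufpd(1, a[0:2], a[0:2])      # swap the low half of %a
    ymm0 = xmm0 + [0, 0]                   # assumed: upper half zeroed
    xmm1 = vextractf128(1, b)              # high half of %b
    xmm1 = vshufpd(1, xmm1, xmm1)          # swap it
    return vinsertf128(1, xmm1, ymm0)      # place it in the high half

a = [10, 11, 12, 13]
b = [20, 21, 22, 23]
print(lower_1_0_7_6(a, b) == shufflevector(a, b, [1, 0, 7, 6]))
```

The mask works here because each 128-bit half of the result draws from a single 128-bit half of one input, which is the case the commit message describes as suitable for regular 128-bit shuffles.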
-
Bruno Cardoso Lopes authored
also add the AVX versions of the 128-bit patterns llvm-svn: 137685
-
Bruno Cardoso Lopes authored
predicate and TB encoding fields. This fixes the encoding for the attached testcase. This fixes PR10625. llvm-svn: 137684
-
Jim Grosbach authored
Allow a target assembly parser to do context sensitive constraint checking on a potential instruction match. This will be used, for example, to handle Thumb2 IT block parsing. llvm-svn: 137675
-
- Aug 15, 2011
-
-
Bruno Cardoso Lopes authored
when AVX mode is on. Otherwise it is just more work for the type legalizer. llvm-svn: 137661
-
- Aug 12, 2011
-
-
Bruno Cardoso Lopes authored
llvm-svn: 137521
-
Bruno Cardoso Lopes authored
vectors. It operates on 128-bit elements instead of regular scalar types. Recognize shuffles that are suitable for VPERM2F128 and teach the x86 legalizer how to handle them. llvm-svn: 137519
-
Bruno Cardoso Lopes authored
llvm-svn: 137518
-
Duncan Sands authored
when building with assertions disabled. llvm-svn: 137460
-
Andrew Trick authored
Fix by Ivan Baev. Sorry I don't have a unit test, but the fix is obvious so I don't want to delay it. llvm-svn: 137404
-
- Aug 11, 2011
-
-
Bruno Cardoso Lopes authored
inserts and extracts. This simple combine makes us generate only 1 instruction instead of 11 in the v8 case. llvm-svn: 137362
-
Bruno Cardoso Lopes authored
llvm-svn: 137324
-
Nadav Rotem authored
llvm-svn: 137313
-
Nadav Rotem authored
(for example, after integer operation), do not pack the registers into a YMM before saving. It's better to save as two XMM registers.

Before:
  vinsertf128 $1, %xmm3, %ymm0, %ymm3
  vinsertf128 $0, %xmm1, %ymm3, %ymm1
  vmovaps %ymm1, 416(%rsp)

After:
  vmovaps %xmm3, 416+16(%rsp)
  vmovaps %xmm1, 416(%rsp)

llvm-svn: 137308
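That the two sequences in this commit message leave the same bytes in memory can be illustrated with a byte-level model. This is a rough sketch under stated assumptions: the helper names are hypothetical, the stack is a plain bytearray, and the prior contents of %ymm0 are modeled as zeros since both of its halves get overwritten before the store.

```python
# Toy model: XMM registers are 16-byte strings, YMM registers 32-byte strings.

def insert128(imm, xmm, ymm):
    """Model of vinsertf128: replace one 16-byte half of a 32-byte value."""
    out = bytearray(ymm)
    out[16 * imm: 16 * imm + 16] = xmm
    return bytes(out)

def store_before(mem, base, xmm1, xmm3):
    """Pack both halves into one YMM value, then do a single 32-byte store."""
    ymm0 = bytes(32)                      # stand-in for the old %ymm0 contents
    ymm3 = insert128(1, xmm3, ymm0)
    ymm1 = insert128(0, xmm1, ymm3)
    mem[base: base + 32] = ymm1

def store_after(mem, base, xmm1, xmm3):
    """Skip the packing and do two 16-byte stores."""
    mem[base + 16: base + 32] = xmm3
    mem[base: base + 16] = xmm1

xmm1 = bytes(range(16))
xmm3 = bytes(range(16, 32))
stack_a, stack_b = bytearray(512), bytearray(512)
store_before(stack_a, 416, xmm1, xmm3)
store_after(stack_b, 416, xmm1, xmm3)
print(stack_a == stack_b)
```

The second form avoids the two vinsertf128 instructions entirely, which is the saving the message points at.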
-
Bruno Cardoso Lopes authored
llvm-svn: 137297
-
Bruno Cardoso Lopes authored
infinite recursive calls in legalize. Fix PR10562 llvm-svn: 137296
-
Bruno Cardoso Lopes authored
could only get undefs and the vector shuffle becomes an undef, generating wrong code. llvm-svn: 137295
-
Eli Friedman authored
Fix X86TargetLowering::LowerExternalSymbol so that it actually works in non-trivial cases. This hasn't been an issue before because the function isn't normally called (but apparently is used to generate a tail-call to sin() on ELF x86-32 with PIC and SSE2). Fixes PR9693. llvm-svn: 137292
-
- Aug 10, 2011
-
-
Nadav Rotem authored
data in-register prior to saving to memory. When we reorder the data in memory we prevent the need to save multiple scalars to memory, making a single regular store. llvm-svn: 137238
-
Bruno Cardoso Lopes authored
def : Pat<(X86Movss VR128:$src1, (bc_v4i32 (v2i64 (load addr:$src2)))), (MOVLPSrm VR128:$src1, addr:$src2)>;

This matches a MOVSS dag with a MOVLPS instruction. However, MOVSS will replace only the low 32 bits of the register, while the MOVLPS instruction will replace the low 64 bits. A testcase is added to illustrate the bug, and the one that was already present is modified. Patch by Tanya Lattner. llvm-svn: 137227
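The mismatch this message describes shows up directly in a lane-level model. The sketch below is illustrative only: the two Python functions are hypothetical stand-ins for the X86Movss node and the MOVLPSrm instruction, and a 128-bit register is modeled as a list of four 32-bit lanes.

```python
def x86_movss(src1, src2):
    """Model of the MOVSS node: only lane 0 (the low 32 bits) is replaced."""
    return [src2[0]] + list(src1[1:])

def movlps_rm(src1, loaded):
    """Model of MOVLPS from memory: lanes 0 and 1 (the low 64 bits) are replaced."""
    return list(loaded[:2]) + list(src1[2:])

src1 = [1, 2, 3, 4]      # current register contents
loaded = [9, 8, 7, 6]    # value produced by the load

print(x86_movss(src1, loaded))   # lane 1 keeps its old value
print(movlps_rm(src1, loaded))   # lane 1 is overwritten
```

The two results differ in lane 1, so selecting MOVLPS for a MOVSS node changes 32 bits that the node says must be preserved.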
-
Bruno Cardoso Lopes authored
llvm-svn: 137194
-
Bruno Cardoso Lopes authored
llvm-svn: 137179
-
Bruno Cardoso Lopes authored
llvm-svn: 137166
-
Bruno Cardoso Lopes authored
is the best we can do for these patterns. This fixes PR10554. llvm-svn: 137161
-
-
- Aug 09, 2011
-
-
Bruno Cardoso Lopes authored
v4f64 = sitofp v4i32. This fixes PR10559. Also add support for v4i32 = fptosi v4f64. llvm-svn: 137128
-
Bruno Cardoso Lopes authored
llvm-svn: 137127
-
Bruno Cardoso Lopes authored
llvm-svn: 137114
-