- Aug 24, 2011
-
-
Eli Friedman authored
llvm-svn: 138487
-
Eli Friedman authored
llvm-svn: 138478
-
Craig Topper authored
Break 256-bit vector int add/sub/mul into two 128-bit operations to avoid costly scalarization. Fixes PR10711. llvm-svn: 138427
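As a rough illustration of the splitting idea in this commit message (not the actual X86 lowering code), the sketch below models a 256-bit vector as a Python list of eight 32-bit lanes and performs the add as two independent 128-bit halves. The names `add_lanes` and `split_add_256` and the 32-bit lane width are assumptions made for the example.

```python
def add_lanes(a, b, bits=32):
    """Lane-wise wrapping add, modeling one 128-bit integer vector add."""
    mask = (1 << bits) - 1
    return [(x + y) & mask for x, y in zip(a, b)]


def split_add_256(a, b, bits=32):
    """Model a 256-bit add as two 128-bit adds on the low and high halves."""
    half = len(a) // 2
    lo = add_lanes(a[:half], b[:half], bits)   # low 128 bits
    hi = add_lanes(a[half:], b[half:], bits)   # high 128 bits
    return lo + hi                             # concatenate the halves


a = [1, 2, 3, 4, 5, 6, 7, 0xFFFFFFFF]
b = [1] * 8
result = split_add_256(a, b)
```

Because integer add, sub and mul act on each lane independently, the split result matches a single full-width lane-wise operation.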
-
Bruno Cardoso Lopes authored
permutations. Also tidy up some patterns and make them close to their instruction definition! llvm-svn: 138392
-
- Aug 23, 2011
-
-
Nick Lewycky authored
llvm-svn: 138354
-
Craig Topper authored
Add support for breaking 256-bit v16i16 and v32i8 VSETCC into two 128-bit ones, avoiding scalarization. Add vex form of pcmpeqq and pcmpgtq. Fixes more cases for PR10712. llvm-svn: 138321
-
- Aug 22, 2011
-
-
Bruno Cardoso Lopes authored
avoiding scalarization of the compare. Reduces code from 59 to 6 instructions. Fix PR10712. llvm-svn: 138271
-
- Aug 18, 2011
-
-
Bruno Cardoso Lopes authored
shift amount is variable llvm-svn: 137885
-
- Aug 17, 2011
-
-
Bruno Cardoso Lopes authored
match splats in the form (splat (scalar_to_vector (load ...))) whenever the load can be folded. All the logic and instruction emission is working, but because of PR8156 there is no way to match the loads, since they can never be folded for splats. Thus, the tests are XFAILed, but I've tested and exercised all the logic using a relaxed version of the foldable-load check, as if the bug was already fixed. This should work out of the box once PR8156 gets fixed, since MayFoldLoad will then work as expected. llvm-svn: 137810
-
Bruno Cardoso Lopes authored
llvm-svn: 137808
-
Bruno Cardoso Lopes authored
vinsertf128 $1 + vpermilps $0, remove the old code that used to first do the splat in a 128-bit vector and then insert it into a larger one. This is better because the handling code gets simpler and it also makes room for the upcoming vbroadcast! llvm-svn: 137807
-
- Aug 16, 2011
-
-
Bruno Cardoso Lopes authored
there is no support for native 256-bit shuffles, be smarter in some cases, for example, when you can extract specific 128-bit parts and use regular 128-bit shuffles for them. Example:

For this shuffle:

  shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 1, i32 0, i32 7, i32 6>

This was expanded to:

  vextractf128 $1, %ymm1, %xmm2
  vpextrq $0, %xmm2, %rax
  vmovd %rax, %xmm1
  vpextrq $1, %xmm2, %rax
  vmovd %rax, %xmm2
  vpunpcklqdq %xmm1, %xmm2, %xmm1
  vpextrq $0, %xmm0, %rax
  vmovd %rax, %xmm2
  vpextrq $1, %xmm0, %rax
  vmovd %rax, %xmm0
  vpunpcklqdq %xmm2, %xmm0, %xmm0
  vinsertf128 $1, %xmm1, %ymm0, %ymm0
  ret

Now we get:

  vshufpd $1, %xmm0, %xmm0, %xmm0
  vextractf128 $1, %ymm1, %xmm1
  vshufpd $1, %xmm1, %xmm1, %xmm1
  vinsertf128 $1, %xmm1, %ymm0, %ymm0

llvm-svn: 137733
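To make the example in this commit message easier to follow, here is a small model of it in Python; it is a sketch of the semantics only, not of the legalizer code. `shufflevector` follows the LLVM IR rule that mask indices select from the concatenation of the two inputs. `lowered_by_halves` is a hypothetical helper that mirrors the four-instruction sequence quoted above for this one mask.

```python
def shufflevector(a, b, mask):
    """LLVM-style shuffle: indices address the concatenation a ++ b."""
    src = a + b
    return [src[i] for i in mask]


def swap_pair(x):
    """Model of 'vshufpd $1, x, x, x' on a 2-element (128-bit) vector."""
    return [x[1], x[0]]


def lowered_by_halves(a, b):
    """Mirror the new instruction sequence for mask <1, 0, 7, 6>."""
    lo = swap_pair(a[0:2])   # vshufpd $1 on the low half of %a
    hi = b[2:4]              # vextractf128 $1: high half of %b
    hi = swap_pair(hi)       # vshufpd $1 on that half
    return lo + hi           # vinsertf128 $1: put hi into the upper lanes


a = [10, 11, 12, 13]
b = [20, 21, 22, 23]
expected = shufflevector(a, b, [1, 0, 7, 6])
actual = lowered_by_halves(a, b)
```

This works for the quoted mask because each 128-bit half of the result draws from a single 128-bit half of one input.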
-
- Aug 15, 2011
-
-
Bruno Cardoso Lopes authored
when AVX mode is on. Otherwise it is just more work for the type legalizer. llvm-svn: 137661
-
- Aug 12, 2011
-
-
Bruno Cardoso Lopes authored
llvm-svn: 137521
-
Bruno Cardoso Lopes authored
vectors. It operates on 128-bit elements instead of regular scalar types. Recognize shuffles that are suitable for VPERM2F128 and teach the x86 legalizer how to handle them. llvm-svn: 137519
-
- Aug 11, 2011
-
-
Bruno Cardoso Lopes authored
inserts and extracts. This simple combine makes us generate only 1 instruction instead of 11 in the v8 case. llvm-svn: 137362
-
Bruno Cardoso Lopes authored
llvm-svn: 137324
-
Nadav Rotem authored
llvm-svn: 137313
-
Nadav Rotem authored
(for example, after an integer operation), do not pack the registers into a YMM before saving. It's better to save as two XMM registers.

Before:

  vinsertf128 $1, %xmm3, %ymm0, %ymm3
  vinsertf128 $0, %xmm1, %ymm3, %ymm1
  vmovaps %ymm1, 416(%rsp)

After:

  vmovaps %xmm3, 416+16(%rsp)
  vmovaps %xmm1, 416(%rsp)

llvm-svn: 137308
-
Bruno Cardoso Lopes authored
infinite recursive calls in legalize. Fix PR10562 llvm-svn: 137296
-
Bruno Cardoso Lopes authored
could only get undefs and the vector shuffle becomes an undef, generating wrong code. llvm-svn: 137295
-
Eli Friedman authored
Fix X86TargetLowering::LowerExternalSymbol so that it actually works in non-trivial cases. This hasn't been an issue before because the function isn't normally called (but apparently is used to generate a tail-call to sin() on ELF x86-32 with PIC and SSE2). Fixes PR9693. llvm-svn: 137292
-
- Aug 10, 2011
-
-
Nadav Rotem authored
data in-register prior to saving to memory. When we reorder the data in-register, we avoid the need to save multiple scalars to memory and can make a single regular store instead. llvm-svn: 137238
-
Bruno Cardoso Lopes authored
llvm-svn: 137194
-
Bruno Cardoso Lopes authored
llvm-svn: 137179
-
Bruno Cardoso Lopes authored
is the best we can do for these patterns. This fixes PR10554. llvm-svn: 137161
-
- Aug 09, 2011
-
-
Bruno Cardoso Lopes authored
llvm-svn: 137127
-
Bruno Cardoso Lopes authored
llvm-svn: 137114
-
Bruno Cardoso Lopes authored
llvm-svn: 137090
-
- Aug 08, 2011
-
-
Bruno Cardoso Lopes authored
llvm-svn: 137067
-
- Aug 04, 2011
-
-
Evan Cheng authored
llvm-svn: 136899
-
Bill Wendling authored
Fixes PR10527. llvm-svn: 136853
-
- Aug 03, 2011
-
-
Benjamin Kramer authored
llvm-svn: 136803
-
- Aug 02, 2011
-
-
Eli Friedman authored
The testcase looks extremely fragile, so I'm adding an assertion which should catch any cases like this. llvm-svn: 136711
-
Bruno Cardoso Lopes authored
shuffle (scalar_to_vector (load (ptr + 4))), undef, <0, 0, 0, 0> To: shuffle (vload ptr), undef, <1, 1, 1, 1> Fix PR10494 llvm-svn: 136691
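The two shuffle forms quoted in this commit message can be checked against each other with a small model. This is only a sketch under stated assumptions: memory is a Python list of 4-byte elements, so `ptr + 4` bytes corresponds to index `idx + 1`, and the helper names (`scalar_to_vector`, `splat_before`, `splat_after`) are made up for the illustration. The unused `undef` second operand of each shuffle is omitted.

```python
UNDEF = None  # stands in for an undefined vector lane


def shuffle(vec, mask):
    """Single-input shuffle: pick lanes of vec by index."""
    return [vec[i] for i in mask]


def scalar_to_vector(x, width=4):
    """Put a scalar in lane 0; the remaining lanes are undefined."""
    return [x] + [UNDEF] * (width - 1)


def splat_before(memory, idx):
    # shuffle (scalar_to_vector (load (ptr + 4))), undef, <0, 0, 0, 0>
    return shuffle(scalar_to_vector(memory[idx + 1]), [0, 0, 0, 0])


def splat_after(memory, idx):
    # shuffle (vload ptr), undef, <1, 1, 1, 1>
    return shuffle(memory[idx:idx + 4], [1, 1, 1, 1])


memory = [7, 8, 9, 10, 11]
before = splat_before(memory, 0)
after = splat_after(memory, 0)
```

Both forms splat the element one slot past `ptr`, so under this model they give the same vector.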
-
- Aug 01, 2011
-
-
Bruno Cardoso Lopes authored
the legalizer. This commit together with the two previous ones fixes PR10495. llvm-svn: 136654
-
Bruno Cardoso Lopes authored
llvm-svn: 136653
-
Bruno Cardoso Lopes authored
using a stack store. llvm-svn: 136652
-
Bruno Cardoso Lopes authored
avoid returning early for v8i32 types, which would only be valid for vectors with all zeros. Also split the handling of zeros and ones into separate checking logic, since they are handled differently. This fixes PR10547. llvm-svn: 136642
-
- Jul 29, 2011
-
-
Eli Friedman authored
working on x86 (at least for trivial testcases); other architectures will need more work so that they actually emit the appropriate instructions for orderings stricter than 'monotonic'. (As far as I can tell, the ARM, PPC, Mips, and Alpha backends need such changes.) llvm-svn: 136457
-