- Apr 24, 2007
-
Owen Anderson authored
my approach to this, so hopefully I'll find a way to do this without making this slower. llvm-svn: 36392
-
Devang Patel authored
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070423/048333.html llvm-svn: 36380
-
- Apr 21, 2007
-
Owen Anderson authored
llvm-svn: 36300
-
Owen Anderson authored
llvm-svn: 36299
-
Jeff Cohen authored
llvm-svn: 36287
-
- Apr 20, 2007
-
Devang Patel authored
llvm-svn: 36272
-
Owen Anderson authored
llvm-svn: 36271
-
- Apr 19, 2007
-
Zhou Sheng authored
llvm-svn: 36261
-
Zhou Sheng authored
llvm-svn: 36260
-
Evan Cheng authored
llvm-svn: 36258
-
- Apr 18, 2007
-
Owen Anderson authored
llvm-svn: 36255
-
Owen Anderson authored
llvm-svn: 36254
-
Owen Anderson authored
llvm-svn: 36252
-
Owen Anderson authored
llvm-svn: 36249
-
Owen Anderson authored
llvm-svn: 36248
-
Owen Anderson authored
llvm-svn: 36247
-
- Apr 17, 2007
-
Dan Gohman authored
gets called. llvm-svn: 36208
-
Chris Lattner authored
llvm-svn: 36205
-
Chris Lattner authored
llvm-svn: 36202
-
Chris Lattner authored
llvm-svn: 36200
-
Chris Lattner authored
llvm-svn: 36199
-
Devang Patel authored
Fix http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070416/047888.html llvm-svn: 36182
-
- Apr 16, 2007
-
Anton Korobeynikov authored
target for tabs checking. llvm-svn: 36146
-
- Apr 15, 2007
-
Chris Lattner authored
llvm-svn: 36090
-
Owen Anderson authored
Remove ImmediateDominator analysis. The same information can be obtained from DomTree. A lot of code for constructing ImmediateDominator is now folded into DomTree construction. This is part of the ongoing work for PR217. llvm-svn: 36063
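For orientation, a minimal sketch of the replacement query (written against the present-day DominatorTree interface; the exact spelling at this revision may differ): the immediate dominator that ImmediateDominator used to provide is just the parent of a block's node in the dominator tree.

#include "llvm/IR/Dominators.h"
using namespace llvm;

// Hedged sketch, not code from this commit: read the immediate dominator
// of BB off the dominator tree instead of a separate ImmediateDominator
// analysis.
static BasicBlock *getIDomBlock(DominatorTree &DT, BasicBlock *BB) {
  if (DomTreeNode *N = DT.getNode(BB))        // tree node for BB
    if (DomTreeNode *IDom = N->getIDom())     // parent node in the dom tree
      return IDom->getBlock();                // its basic block
  return nullptr;  // entry block or unreachable block: no idom
}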
-
Chris Lattner authored
llvm-svn: 36047
-
Chris Lattner authored
This sinks the two stores in this example into a single store in cond_next. In this case, it allows elimination of the load as well:

        store double 0.000000e+00, double* @s.3060
        %tmp3 = fcmp ogt double %tmp1, 5.000000e-01            ; <i1> [#uses=1]
        br i1 %tmp3, label %cond_true, label %cond_next

cond_true:              ; preds = %entry
        store double 1.000000e+00, double* @s.3060
        br label %cond_next

cond_next:              ; preds = %entry, %cond_true
        %tmp6 = load double* @s.3060            ; <double> [#uses=1]

This implements Transforms/InstCombine/store-merge.ll:test2

llvm-svn: 36040
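Roughly, the C-level shape of the case above is the following (an illustrative sketch only; the function and variable names are invented, not taken from the test case):

// Illustrative analogue of the IR above; names are invented for the sketch.
double s;                    // stands in for @s.3060

double f(double x) {         // x stands in for %tmp1
  s = 0.0;                   // unconditional store in entry
  if (x > 0.5)
    s = 1.0;                 // conditional store in cond_true
  // After the transform, the two stores become a single store of either
  // 0.0 or 1.0 at the join point, and the reload below can then be
  // simplified to that stored value.
  return s;                  // reload in cond_next (%tmp6)
}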
-
Chris Lattner authored
llvm-svn: 36037
-
Chris Lattner authored
llvm-svn: 36031
-
Chris Lattner authored
define i32 @test(float %f) {
        %tmp7 = insertelement <4 x float> undef, float %f, i32 0
        %tmp17 = bitcast <4 x float> %tmp7 to <4 x i32>
        %tmp19 = extractelement <4 x i32> %tmp17, i32 0
        ret i32 %tmp19
}

into:

define i32 @test(float %f) {
        %tmp19 = bitcast float %f to i32                ; <i32> [#uses=1]
        ret i32 %tmp19
}

On PPC, this is the difference between:

_test:
        mfspr r2, 256
        oris r3, r2, 8192
        mtspr 256, r3
        stfs f1, -16(r1)
        addi r3, r1, -16
        addi r4, r1, -32
        lvx v2, 0, r3
        stvx v2, 0, r4
        lwz r3, -32(r1)
        mtspr 256, r2
        blr

and:

_test:
        stfs f1, -4(r1)
        nop
        nop
        nop
        lwz r3, -4(r1)
        blr

llvm-svn: 36025
-
Chris Lattner authored
unsigned test(float f) {
        return _mm_cvtsi128_si32( (__m128i) _mm_set_ss( f*f ));
}

into:

_test:
        movss 4(%esp), %xmm0
        mulss %xmm0, %xmm0
        movd %xmm0, %eax
        ret

instead of:

_test:
        movss 4(%esp), %xmm0
        mulss %xmm0, %xmm0
        xorps %xmm1, %xmm1
        movss %xmm0, %xmm1
        movd %xmm1, %eax
        ret

GCC gets:

_test:
        subl $28, %esp
        movss 32(%esp), %xmm0
        mulss %xmm0, %xmm0
        xorps %xmm1, %xmm1
        movss %xmm0, %xmm1
        movaps %xmm1, %xmm0
        movd %xmm0, 12(%esp)
        movl 12(%esp), %eax
        addl $28, %esp
        ret

llvm-svn: 36020
-
Chris Lattner authored
llvm-svn: 36017
-
- Apr 14, 2007
-
Chris Lattner authored
llvm-svn: 36002
-
Jeff Cohen authored
llvm-svn: 35998
-
Jeff Cohen authored
llvm-svn: 35996
-
Chris Lattner authored
printf("") -> noop. Still need to do the xforms for fprintf. This implements Transforms/SimplifyLibCalls/Printf.ll llvm-svn: 35984
-
Chris Lattner authored
in order to clean up after simplifylibcalls. llvm-svn: 35982
-
Chris Lattner authored
llvm-svn: 35981
-
Chris Lattner authored
llvm-svn: 35979
-
- Apr 13, 2007
-
Chris Lattner authored
out to do! :) This fixes a problem where LSR would insert a bunch of code into each MBB that uses a particular subexpression (e.g. IV+base+C). The problem is that this code cannot be CSE'd back together if inserted into different blocks.

This patch changes LSR to attempt to insert a single copy of this code and share it, allowing codegenprepare to duplicate the code if it can be sunk into various addressing modes. On CodeGen/ARM/lsr-code-insertion.ll, for example, this gives us code like:

        add r8, r0, r5
        str r6, [r8, #+4]
        ..
        ble LBB1_4      @cond_next
LBB1_3: @cond_true
        str r10, [r8, #+4]
LBB1_4: @cond_next
        ...
LBB1_5: @cond_true55
        ldr r6, LCPI1_1
        str r6, [r8, #+4]

instead of:

        add r10, r0, r6
        str r8, [r10, #+4]
        ...
        ble LBB1_4      @cond_next
LBB1_3: @cond_true
        add r8, r0, r6
        str r10, [r8, #+4]
LBB1_4: @cond_next
        ...
LBB1_5: @cond_true55
        add r8, r0, r6
        ldr r10, LCPI1_1
        str r10, [r8, #+4]

Besides being smaller and more efficient, this makes it immediately obvious that it is profitable to predicate LBB1_3 now :)

llvm-svn: 35972
-